I recently read Kritzman and Li’s clever 2010 paper Skulls, Financial Turbulence, and Risk Management. Kritzman and Li characterize financial turbulence as a period where established financial relationships uncouple, prices swing, and market predictions break down. Does that sound like financial markets in 2020? Yup. So I thought it would be interesting to take a… Read More Financial Turbulence: Off the Chart
In this first post in a series on recommendation systems, we’re going to develop a powerful but highly intuitive representation for user behavior that will allow us to easily make recommendations. Since we’re going to be making heavy use of the Goodreads data set in the series, we’ll formulate our basic recommendation system problem as… Read More Recommendation Systems: Co-occurrence Calculations
Today, I am announcing a series of posts I am developing about recommendation systems. The series is aimed at software/machine learning engineers. I have two goals for the series: (1) provide practical and implementable strategies for delivering recommendations in real time, and (2) present the mathematical intuition behind recommender problems. The reason for the first goal is that I… Read More A Practical Series on Recommendation Systems
If you are reading this, you probably already know that data pre-processing is the 90% perspiration of machine learning. You might love it or you might dread it, but you probably don’t think of it as the part of ML where the most interesting mathematics lives. Let me challenge that view a bit with… Read More Can you norm rows and standardize columns at the same time?
On February 28, I presented at the University of Kentucky’s Mathematics Department Alumni Day. My talk contains practical advice for math students (graduate and undergraduate) to prepare for Machine Learning careers.
I’m working through Wasserman’s All of Nonparametric Statistics, a wonderful and concise tour of nonparametric techniques. What is nonparametric statistics? It is a collection of estimation techniques that make as few assumptions as possible about the distribution from which your data came. Let’s work through an example in R that’s mentioned in Chapter 3 of… Read More A Jackknife Example
In the past, I wrote frequently about quadratic programming, especially in R, for example here and here. It’s been a while, and at least one great new library has emerged since my last post on quadratic programming: OSQP. OSQP introduces a new technique called operator splitting which offers significant performance improvements over standard interior… Read More Sparse quadratic programming with osqp
A few months ago, I saw this post by Chris Krycho on Hacker News, which points out just how easy it can be to share Rust binaries with friends. The use case really spoke to me: I spend a lot of time building prototypes and I often need to share my prototypes with a few… Read More Rust: Cross compiling from Ubuntu to OS X
A few years ago, I wrote about how to analyze the 2012 California Health Interview Survey in R. In 2012, plans for Covered California (Obamacare in California) were just beginning to take shape. Today, Covered California is a relatively mature program and it is arguably the most successful implementation of the Affordable Care Act in… Read More Analyzing the 2015 California Health Interview Survey in R
In this post, we’ll look at a simple method to identify segments of an image based on RGB color values. The segmentation technique we’ll consider is called color quantization. Not surprisingly, this topic lends itself naturally to visualization and R makes it easy to render some really cool graphics for the color quantization problem. The… Read More Color Quantization in R