In the previous post of the series, we developed a co-occurrence model for book recommendations. The model is similar to Amazon’s highly successful “Customers who bought…” feature. Now it’s time to apply this simple model to some real data to make recommendations. Preprocessing: As usual, we’ll work with the Goodreads dataset. I described the structure… Read More: Recommendation Systems: A Co-occurrence Recommender
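The “Customers who bought…” idea can be sketched in a few lines: count how often pairs of books share a shelf, then recommend the books that co-occur most with a seed title. This is a minimal illustration, not the post’s actual code, and the shelf data here is made up.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_counts(user_shelves):
    """Count how often each ordered pair of books appears on the same user's shelf."""
    counts = defaultdict(int)
    for books in user_shelves:
        for a, b in combinations(sorted(set(books)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts

def recommend(book, counts, k=3):
    """'Readers who shelved `book` also shelved...': top-k co-occurring books."""
    scored = [(other, n) for (b, other), n in counts.items() if b == book]
    return [other for other, _ in sorted(scored, key=lambda t: -t[1])[:k]]

shelves = [["dune", "hyperion", "foundation"],
           ["dune", "foundation"],
           ["hyperion", "dune"]]
counts = cooccurrence_counts(shelves)
print(recommend("dune", counts))  # foundation and hyperion both co-occur with "dune"
```

In practice the counts matrix is sparse and large, which is exactly the preprocessing challenge the post tackles on the real Goodreads data.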
I recently read Kritzman and Li’s clever 2010 paper Skulls, Financial Turbulence, and Risk Management. Kritzman and Li characterize financial turbulence as a period where established financial relationships uncouple, prices swing, and market predictions break down. Does that sound like financial markets in 2020? Yup. So I thought it would be interesting to take a… Read More: Financial Turbulence: Off the Chart
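Kritzman and Li’s turbulence measure is, at its core, a Mahalanobis distance: how far a period’s return vector sits from the historical mean, given the historical covariance of the assets. A minimal numpy sketch (variable names and the synthetic data are mine, not the post’s):

```python
import numpy as np

def turbulence(returns):
    """Mahalanobis distance of each period's return vector from the sample mean.

    `returns` is a (T, n_assets) array; large values flag periods where assets
    move unusually far, or in unusual combinations, relative to history.
    """
    mu = returns.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(returns, rowvar=False))
    centered = returns - mu
    return np.einsum("ti,ij,tj->t", centered, cov_inv, centered)

rng = np.random.default_rng(0)
rets = rng.normal(0, 0.01, size=(250, 3))   # 250 calm days, 3 assets
rets[100] = [0.05, -0.05, 0.06]             # one synthetic shock day
scores = turbulence(rets)
print(scores.argmax())                      # the shock day stands out
```

The appeal of the measure is that it flags not just big moves but also moves in directions that historical correlations say should not happen together.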
In this first post in a series on recommendation systems, we’re going to develop a powerful but highly intuitive representation for user behavior that will allow us to easily make recommendations. Since we’re going to be making heavy use of the Goodreads dataset in the series, we’ll formulate our basic recommendation system problem as… Read More: Recommendation Systems: Co-occurrence Calculations
Today, I am announcing a series of posts I am developing about recommendation systems. The series is aimed at software/machine learning engineers. I have two goals for the series: (1) provide practical, implementable strategies for delivering recommendations in real time, and (2) present the mathematical intuition behind recommender problems. The reason for the first goal is that I… Read More: A Practical Series on Recommendation Systems
If you are reading this, you probably already know that data pre-processing is the 90% perspiration of machine learning. You might love it or you might dread it, but you probably don’t think of it as the part of ML where the most interesting mathematics lives. Let me challenge that view a bit with… Read More: Can you norm rows and standardize columns at the same time?
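To see why the title question is less trivial than it sounds, try a quick counting argument. If “norm rows” means unit L2 norm, the matrix’s total sum of squares is forced to n (one per row); if “standardize columns” means mean 0 and population standard deviation 1, each column contributes n, for a total of n·d. For d > 1 these are incompatible, so the two operations fight each other. (The post may use different definitions; this numerical check only illustrates the tension under these ones.)

```python
import numpy as np

n, d = 50, 4
rng = np.random.default_rng(1)
X = rng.normal(size=(n, d))

# Unit-norm rows: total sum of squares is exactly n (one per row).
R = X / np.linalg.norm(X, axis=1, keepdims=True)
print((R ** 2).sum())            # 50.0

# Standardized columns (mean 0, population std 1): total sum of squares is n*d.
S = (X - X.mean(axis=0)) / X.std(axis=0)
print((S ** 2).sum())            # 200.0
```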
On February 28, I presented at the University of Kentucky’s Mathematics Department Alumni Day. My talk contains practical advice for math students (graduate and undergraduate) to prepare for Machine Learning careers.
I’m working through Wasserman’s All of Nonparametric Statistics, a wonderful and concise tour of nonparametric techniques. What is nonparametric statistics? It is a collection of estimation techniques that make as few assumptions as possible about the distribution from which your data came. Let’s work through an example in R that’s mentioned in Chapter 3 of… Read More: A Jackknife Example
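The jackknife itself is simple enough to sketch in a few lines: recompute your estimator n times, each time leaving one observation out, and turn the spread of those leave-one-out estimates into a standard error. The post works in R; here is the same idea in Python, with a sanity check that for the sample mean the jackknife reduces to the familiar s/√n.

```python
import numpy as np

def jackknife_se(x, estimator):
    """Jackknife standard error: recompute `estimator` n times,
    each time with one observation deleted."""
    n = len(x)
    theta_i = np.array([estimator(np.delete(x, i)) for i in range(n)])
    return np.sqrt((n - 1) / n * np.sum((theta_i - theta_i.mean()) ** 2))

rng = np.random.default_rng(42)
x = rng.normal(10, 2, size=100)

# For the mean, the jackknife SE equals the usual s / sqrt(n) exactly.
print(jackknife_se(x, np.mean), x.std(ddof=1) / np.sqrt(len(x)))
```

The payoff is that the same recipe gives a standard error for estimators, like the median or a correlation, where no tidy closed-form formula exists.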
In the past, I wrote frequently about quadratic programming, especially in R, for example here and here. It’s been a while and at least one great new library has emerged since my last post on quadratic programming — OSQP. OSQP introduces a new technique called operator splitting which offers significant performance improvements over standard interior… Read More: Sparse quadratic programming with osqp
So you’ve filled an S3 bucket with hundreds of millions of objects and now, for one reason or another, you need to copy that data into another S3 bucket. How do you do that efficiently? If you’ve worked with S3 for a while, you’ve probably seen how painfully slow it is to list all of the… Read More: Copying lots of data between S3 buckets
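Whatever the listing strategy, the copy step is I/O-bound, so one standard ingredient is to issue many per-object copies concurrently. A sketch of that idea, with a stand-in for the real copy call (a real version would call boto3’s `copy_object` for each key; the key names here are invented):

```python
from concurrent.futures import ThreadPoolExecutor

def copy_key(key):
    """Stand-in for a real per-object copy, e.g. boto3's
    s3.copy_object(CopySource=..., Bucket=dst_bucket, Key=key)."""
    return key  # pretend the copy succeeded

def parallel_copy(keys, workers=64):
    # Copies are I/O-bound, so a large thread pool overlaps request latency.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(copy_key, keys))

done = parallel_copy([f"data/part-{i:05d}" for i in range(1000)])
print(len(done))  # 1000
```

At hundreds of millions of objects, a single machine’s thread pool stops being enough, which is where the post’s discussion of doing this efficiently comes in.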
Reading through papers on the Word2vec skip-gram model, I found myself confused on a fairly uninteresting point in the mechanics of the output layer. What was never made explicit enough (at least to me) is that the output layer returns the exact same output distribution for each context. To see why this must be true,… Read More: Word2Vec: Skip-Gram Feedforward Architecture
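The point is easy to verify numerically: the hidden layer is just the center word’s embedding, and every context position is scored by the same output matrix, so all C output softmaxes are one and the same vector. A toy numpy check (dimensions and weights here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
V, d, C = 10, 5, 4                # vocab size, embedding dim, context positions
W_in = rng.normal(size=(V, d))    # input (center-word) embeddings
W_out = rng.normal(size=(d, V))   # output weights, shared by all context slots

h = W_in[3]                       # hidden layer = embedding of center word 3
logits = h @ W_out
p = np.exp(logits) / np.exp(logits).sum()

# The network has no notion of context position: each of the C slots is
# scored with the same h and the same W_out, so every slot outputs p.
outputs = np.tile(p, (C, 1))
print(np.allclose(outputs, p))    # True
```

The positions only become distinguishable through the loss, where each slot is compared against a different observed context word.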