Universal Inference: Review Part I

There’s an important new paper out by Larry Wasserman et al. that describes a very general technique, called Universal Inference, for constructing statistical hypothesis tests and confidence intervals. In the traditional theory of statistics, such as would be taught in an undergraduate mathematical statistics course, a standard way hypothesis tests are constructed and analyzed is…

Recommendation Systems: From Co-occurrence Counts to Probabilities

In the previous post, we demonstrated how to efficiently compute co-occurrences with matrix algebra and use those calculations to recommend books to users. Though we saw some sensible recommendations come out of this approach, it also suffers from a number of issues, including the Gatsby Problem: popular books tend to be overrepresented in the…

Recommendation Systems: A Co-occurrence Recommender

In the previous post of the series, we developed a co-occurrence model for book recommendations. The model is similar to Amazon’s highly successful “Customers who bought…” feature. Now it’s time to apply this simple model to some real data to make recommendations. Preprocessing: As usual, we’ll work with the Goodreads dataset. I described the structure…

Recommendation Systems: Co-occurrence Calculations

In this first post in a series on recommendation systems, we’re going to develop a powerful but highly intuitive representation for user behavior that will allow us to easily make recommendations. Since we’re going to be making heavy use of the Goodreads data set in the series, we’ll formulate our basic recommendation system problem as…

A Practical Series on Recommendation Systems

Today, I am announcing a series of posts I am developing about recommendation systems. The series is aimed at software/machine learning engineers. I have two goals for the series: (1) provide practical and implementable strategies for delivering recommendations in real time, and (2) present the mathematical intuition behind recommender problems. The reason for the first goal is that I…

A Jackknife Example

I’m working through Wasserman’s All of Nonparametric Statistics, a wonderful and concise tour of nonparametric techniques. What is nonparametric statistics? It is a collection of estimation techniques that make as few assumptions as possible about the distribution from which your data came. Let’s work through an example in R that’s mentioned in Chapter 3 of…