# Universal Inference: Review II

In Part I of this series, we covered some of the prerequisites for understanding the new paper Universal Inference by Wasserman et al. In particular, we reviewed the classical likelihood ratio test and worked through some specific examples. In this post, we’ll look at the split likelihood ratio test, which is the key idea behind universal…
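As a pointer to where this post is headed, the split likelihood ratio test can be stated compactly (notation mine, and may differ from the post's): split the data into two halves $D_0$ and $D_1$, compute any estimate $\hat{\theta}_1$ from $D_1$ alone, and form

$$
U_n \;=\; \frac{\mathcal{L}_0(\hat{\theta}_1)}{\sup_{\theta \in \Theta_0} \mathcal{L}_0(\theta)},
$$

where $\mathcal{L}_0$ is the likelihood evaluated on $D_0$ and $\Theta_0$ is the null parameter set. Rejecting $H_0$ when $U_n \ge 1/\alpha$ yields a finite-sample level-$\alpha$ test, with validity following from Markov's inequality rather than from large-sample asymptotics.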

# Universal Inference: Review Part I

There’s an important new paper out by Larry Wasserman et al. that describes a very general technique, called Universal Inference, for constructing statistical hypothesis tests and confidence intervals. In the traditional theory of statistics, such as would be taught in an undergraduate mathematical statistics course, a standard way hypothesis tests are constructed and analyzed is…

# Recommendation Systems: From Co-occurrence Counts to Probabilities

In the previous post, we demonstrated how to efficiently compute co-occurrences with matrix algebra and use those calculations to recommend books to users. Though we saw some sensible recommendations come out of this approach, it also suffers from a number of issues, including: The Gatsby Problem: popular books tend to be overrepresented in the…

# Recommendation Systems: A Co-occurrence Recommender

In the previous post of the series, we developed a co-occurrence model for book recommendations. The model is similar to Amazon’s highly successful “Customers who bought…” feature. Now it’s time to apply this simple model to some real data to make recommendations. Preprocessing As usual, we’ll work with the Goodreads dataset. I described the structure…

# Financial Turbulence: Off the Chart

I recently read Kritzman and Li’s clever 2010 paper Skulls, Financial Turbulence, and Risk Management. Kritzman and Li characterize financial turbulence as a period where established financial relationships uncouple, prices swing, and market predictions break down. Does that sound like financial markets in 2020? Yup. So I thought it would be interesting to take a…
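Kritzman and Li's turbulence measure is a Mahalanobis distance: each period's return vector is scored by how far it sits from the historical mean, relative to the historical covariance. A minimal sketch on toy data (the `turbulence` function name and the simulated returns are mine, not the post's):

```python
import numpy as np

def turbulence(returns):
    """Mahalanobis-distance turbulence of each period's return vector,
    measured against the historical mean and covariance."""
    mu = returns.mean(axis=0)
    cov = np.cov(returns, rowvar=False)
    inv = np.linalg.inv(cov)
    centered = returns - mu
    # d_t = (y_t - mu)' Sigma^{-1} (y_t - mu), computed for every period t
    return np.einsum("ti,ij,tj->t", centered, inv, centered)

rng = np.random.default_rng(2)
r = rng.normal(scale=0.01, size=(500, 4))  # toy daily returns for 4 assets
d = turbulence(r)
```

In practice the mean and covariance would be estimated from a calm historical window, and turbulence spikes flag periods where correlations and volatilities break from that history.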

# Recommendation Systems: Co-occurrence Calculations

In this first post in a series on recommendation systems, we’re going to develop a powerful but highly intuitive representation for user behavior that will allow us to easily make recommendations. Since we’re going to be making heavy use of the Goodreads data set in the series, we’ll formulate our basic recommendation system problem as…
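The representation in question is a user-by-item interaction matrix, and the co-occurrence counts fall out of a single matrix product. A minimal sketch (the toy matrix is mine, not the post's data):

```python
import numpy as np

# Toy interaction matrix: rows are users, columns are books;
# A[u, b] = 1 if user u shelved book b.
A = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
], dtype=int)

# Co-occurrence matrix: C[i, j] counts users who shelved both book i and book j.
C = A.T @ A

# Zero the diagonal (a book trivially co-occurs with itself), then
# recommend, for each book, the book with the highest co-occurrence count.
np.fill_diagonal(C, 0)
top = C.argmax(axis=1)
```

On real data `A` would be a large sparse matrix, but the same `A.T @ A` structure carries over.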

# A Practical Series on Recommendation Systems

Today, I am announcing a series of posts I am developing about recommendation systems. The series is aimed at software/machine learning engineers. I have two goals for the series: (1) provide practical and implementable strategies for delivering recommendations in real time, and (2) present the mathematical intuition behind recommender problems. The reason for the first goal is that I…

# Can you norm rows and standardize columns at the same time?

If you are reading this, you probably already know that data pre-processing is the 90% perspiration of machine learning. You might love it or you might dread it, but you probably don’t think of it as the part of ML where the most interesting mathematics lives. Let me challenge that view a bit with…
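One way to explore the title question numerically is to note a counting constraint: columns with mean 0 and (population) standard deviation 1 force the total sum of squares to be $np$, while unit-norm rows force it to be $n$, so for $p > 1$ both cannot hold exactly; rows of length $\sqrt{p}$ are compatible, though. A sketch that alternates the two operations under that rescaling (my own experiment, not necessarily the post's argument):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))

for _ in range(200):
    # Standardize columns: mean 0, (population) standard deviation 1.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rescale rows to length sqrt(p), so the total sum of squares
    # (n * p) is consistent with the column condition above.
    X = np.sqrt(p) * X / np.linalg.norm(X, axis=1, keepdims=True)

# How close are we to satisfying both conditions at once?
row_err = np.abs(np.linalg.norm(X, axis=1) - np.sqrt(p)).max()
col_mean_err = np.abs(X.mean(axis=0)).max()
```

Whether this alternating scheme converges in general, and how fast, is exactly the kind of question the post digs into.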

# Mathematicians in Machine Learning

On February 28, I presented at the University of Kentucky’s Mathematics Department Alumni Day. My talk contains practical advice for math students (graduate and undergraduate) to prepare for Machine Learning careers.

# A Jackknife Example

I’m working through Wasserman’s All of Nonparametric Statistics, a wonderful and concise tour of nonparametric techniques. What is nonparametric statistics? It is a collection of estimation techniques that make as few assumptions as possible about the distribution from which your data came. Let’s work through an example in R that’s mentioned in Chapter 3 of…
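The post works its example in R; the jackknife itself takes only a few lines in any language. A Python sketch of the standard-error version (the data and `jackknife_se` helper are mine, not the book's example):

```python
import numpy as np

def jackknife_se(x, stat):
    """Jackknife estimate of the standard error of stat(x):
    recompute the statistic with each observation left out in turn."""
    n = len(x)
    reps = np.array([stat(np.delete(x, i)) for i in range(n)])
    return np.sqrt((n - 1) / n * np.sum((reps - reps.mean()) ** 2))

rng = np.random.default_rng(1)
x = rng.exponential(size=50)

# Sanity check: for the sample mean, the jackknife SE reduces
# exactly to the familiar s / sqrt(n).
se = jackknife_se(x, np.mean)
```

The same function works unchanged for statistics with no closed-form standard error (a trimmed mean, a correlation, etc.), which is where the jackknife earns its keep.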