Improving a Machine Learning System (Part 1 - Broken Abstractions)

This post is part one in a three-part series on the challenges of improving a production machine learning system. Find part two here and part three here. Suppose you have been hired to apply state-of-the-art machine learning technology to improve the Foo vs Bar classifier at... [Read More]
Tags: Machine Learning, Machine Learning Systems, Abstractions

Optimizers as Dynamical Systems

The ideas in this post were hashed out during a series of discussions between Bruno Gavranović and me. Consider a system for forecasting a time series in \(\mathbb{R}\) based on a vector of features in \(\mathbb{R}^a\). At each time \(t\) this system will use the state of the world (represented... [Read More]
Tags: Machine Learning, Category Theory, Lens, Dynamical System, Gradient Descent
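The forecasting system described in the excerpt can be sketched as a discrete dynamical system whose state is a weight vector in \(\mathbb{R}^a\): at each time step it emits a forecast from the current features, then updates its state with a gradient step on the observed error. The linear model and the SGD update rule below are illustrative assumptions, not the post's construction.

```python
import numpy as np

# State of the system: a weight vector w in R^a.
# Forward map: forecast the next value of the series from features x.
def forecast(w, x):
    return w @ x

# Backward map: update the state by a gradient step on the squared error.
# Gradient of 0.5 * (w.x - y)^2 with respect to w is (w.x - y) * x.
def update(w, x, y, lr=0.1):
    return w - lr * (w @ x - y) * x

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])   # hypothetical ground-truth relationship
w = np.zeros(2)
for _ in range(2000):
    x = rng.normal(size=2)       # features observed at this time step
    y = true_w @ x               # the value the series actually takes
    w = update(w, x, y)
print(w)  # converges toward true_w
```

Viewing the forecast/update pair as separate forward and backward maps is what makes the lens perspective natural here.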

Supervised Clustering With Kan Extensions

Clustering algorithms allow us to group points in a dataset together based on some notion of similarity between them. Formally, we can consider a clustering algorithm as mapping a metric space \((X, d_X)\) (representing data) to a partitioning of \(X\). In most applications of clustering the points in the metric... [Read More]
Tags: Clustering, Machine Learning, Extrapolation, Kan Extension, Category Theory, Functorial
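A concrete instance of a clustering algorithm in the sense above is single-linkage clustering at a fixed scale \(\delta\): it maps a finite metric space to the partition given by connected components of the "distance at most \(\delta\)" graph. The sketch below, using a simple union-find, is an illustration of this framing rather than the Kan-extension construction from the post.

```python
import numpy as np

def single_linkage(points, delta):
    """Partition the finite metric space `points` (rows, with the
    Euclidean metric) by merging all pairs at distance <= delta."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the root of i's component, with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(points[i] - points[j]) <= delta:
                union(i, j)

    # Group indices by the root of their component.
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

data = np.array([[0.0], [0.4], [5.0], [5.3]])
print(single_linkage(data, delta=1.0))  # two clusters: {0, 1} and {2, 3}
```

Single-linkage is the standard example of a *functorial* clustering algorithm: a distance-non-increasing map between metric spaces induces a refinement-respecting map between the resulting partitions.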

Transformation Invariant Continuous Optimization Algorithms

Continuous Optimization Algorithms Suppose we have a function \(l: \mathbb{R}^n \rightarrow \mathbb{R}\) that we want to minimize. A popular algorithm for accomplishing this is gradient descent, which is an iterative algorithm in which we pick a step size \(\alpha\) and a starting point \(x_0 \in \mathbb{R}^n\) and repeatedly iterate \(x_{t+\alpha}... [Read More]
Tags: Gradient Descent, Differential Equations, Euler's Method
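The failure of invariance that motivates the post can be seen numerically: a gradient descent step does not commute with a linear change of coordinates. The loss, step size, and scaling matrix below are illustrative choices, not taken from the post.

```python
import numpy as np

# l(x) = 0.5 * (x1^2 + 100 * x2^2), with gradient (x1, 100 * x2).
def grad_l(x):
    return np.array([x[0], 100.0 * x[1]])

def gd_step(x, alpha):
    return x - alpha * grad_l(x)

# Reparametrize by y = S x, so the loss becomes m(y) = l(S^{-1} y)
# with gradient S^{-T} grad_l(S^{-1} y).
S = np.diag([1.0, 0.1])
S_inv = np.diag([1.0, 10.0])

def grad_m(y):
    return S_inv.T @ grad_l(S_inv @ y)

def gd_step_m(y, alpha):
    return y - alpha * grad_m(y)

x0 = np.array([1.0, 1.0])
alpha = 0.005

# Take a GD step in x-coordinates, then map the result into y-coordinates...
step_then_map = S @ gd_step(x0, alpha)
# ...versus mapping first and stepping in y-coordinates: the results differ.
map_then_step = gd_step_m(S @ x0, alpha)
print(step_then_map, map_then_step)
```

By contrast, a Newton step (which rescales the gradient by the inverse Hessian) does commute with affine reparametrizations, which is one reason such invariance is a useful design criterion.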

Gradient Descent Is Euler's Method

Gradient Descent Gradient descent is a technique for iteratively minimizing a convex function \(f: \mathbb{R}^n \rightarrow \mathbb{R}\) by repeatedly taking steps along its negative gradient. We define the gradient of \(f\) to be the unique function \(\nabla f\) that satisfies: \[\lim_{p \rightarrow 0} \frac{f(x+p) - f(x) - \nabla f(x)^{T}p}{\|p\|} = 0\]... [Read More]
Tags: Gradient Descent, Differential Equations, Euler's Method
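The identification in the title can be made concrete: the gradient descent update \(x_{t+1} = x_t - \alpha \nabla f(x_t)\) is exactly Euler's method with step size \(\alpha\) applied to the gradient-flow ODE \(x'(t) = -\nabla f(x(t))\). A minimal sketch, using a convex quadratic chosen for illustration:

```python
import numpy as np

# Convex example: f(x) = 0.5 * x^T A x, with gradient A x (A symmetric PSD).
A = np.array([[2.0, 0.0], [0.0, 10.0]])

def grad_f(x):
    return A @ x

def gradient_descent(x0, alpha, steps):
    """Iterate x_{t+1} = x_t - alpha * grad_f(x_t).

    This is Euler's method with step alpha for the gradient-flow ODE
    x'(t) = -grad_f(x(t)), whose trajectories flow to minimizers of f.
    """
    x = x0.copy()
    for _ in range(steps):
        x = x - alpha * grad_f(x)
    return x

x_star = gradient_descent(np.array([1.0, 1.0]), alpha=0.05, steps=500)
print(x_star)  # approaches the minimizer [0, 0]
```

For this quadratic the iteration converges whenever \(\alpha < 2 / \lambda_{\max}(A)\), mirroring the step-size restrictions familiar from Euler's method on stiff ODEs.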