Optimizers as Dynamical Systems

The ideas in this post were hashed out during a series of discussions between myself and Bruno Gavranović Consider a system for forecasting a time series in \(\mathbb{R}\) based on a vector of features in \(\mathbb{R}^a\). At each time \(t\) this system will use the state of the world (represented... [Read More]
Tags: Machine Learning, Category Theory, Lens, Dynamical System, Gradient Descent

Supervised Clustering With Kan Extensions

Clustering algorithms allow us to group points in a dataset together based on some notion of similarity between them. Formally, we can consider a clustering algorithm as mapping a metric space \((X, d_X)\) (representing data) to a partitioning of \(X\). In most applications of clustering the points in the metric... [Read More]
Tags: Clustering, Machine Learning, Extrapolation, Kan Extension, Category Theory, Functorial

Transformation Invariant Continuous Optimization Algorithms

Continuous Optimization Algorithms Suppose we have a function \(l: \mathbb{R}^n \rightarrow \mathbb{R}\) that we want to minimize. A popular algorithm for accomplishing this is gradient descent, which is an iterative algorithm in which we pick a step size \(\alpha\) and a starting point \(x_0 \in \mathbb{R}^n\) and repeatedly iterate \(x_{t+\alpha}... [Read More]
Tags: Gradient Descent, Differential Equations, Euler's Method

Gradient Descent Is Euler's Method

Gradient Descent Gradient descent is a technique for iteratively minimizing a convex function \(f: \mathbb{R}^n \rightarrow \mathbb{R}\) by repeatedly taking steps along its gradient. We define the gradient of \(f\) to be the unique function \(\nabla f\) that satisfies: \[lim_{p \rightarrow 0} \frac{f(x+p) - f(x) - \nabla f(x)^{T}p}{\|p\|} = 0\]... [Read More]
Tags: Gradient Descent, Differential Equations, Euler's Method

Stability of Mapper Graph Invariants

Introduction The Mapper algorithm is a useful tool for identifying patterns in a large dataset by generating a graph summary. We can describe the Mapper algorithm as constructing a discrete approximation of the Reeb graph: Suppose we have a manifold \(\mathbf{X}\) equipped with a distance metric \(d_{\mathbf{X}}\) (such as a... [Read More]
Tags: Mapper, TDA, Topological Data Analaysis, Machine Learning