## Gradient Descent Is Euler's Method

Posted on November 28, 2020

Gradient Descent Gradient descent is a technique for iteratively minimizing a convex function \(f: \mathbb{R}^n \rightarrow \mathbb{R}\) by repeatedly taking steps along its gradient. We define the gradient of \(f\) to be the unique function \(\nabla f\) that satisfies: \[lim_{p \rightarrow 0} \frac{f(x+p) - f(x) - \nabla f(x)^{T}p}{\|p\|} = 0\]...

[Read More]
Tags:
Gradient Descent, Differential Equations, Euler's Method

## Stability of Mapper Graph Invariants

Posted on November 2, 2020

Introduction The Mapper algorithm is a useful tool for identifying patterns in a large dataset by generating a graph summary. We can describe the Mapper algorithm as constructing a discrete approximation of the Reeb graph: Suppose we have a manifold \(\mathbf{X}\) equipped with a distance metric \(d_{\mathbf{X}}\) (such as a...

[Read More]
Tags:
Mapper, TDA, Topological Data Analaysis, Machine Learning

## Compositionality and Functoriality in Machine Learning

Posted on October 2, 2020

Introduction At the heart of Machine Learning is data. In all Machine Learning problems, we use data generated by some process in order to make inferences about that process. In the most general case, we know little to nothing about the data-generating process, and the data itself is just a...

[Read More]
Tags:
Compositionality, Functor, Machine Learning, Category Theory

## PCA vs Laplacian Eigenmaps

Posted on May 9, 2020

At first glance, PCA and Laplacian Eigenmaps seem both very similar. We can view both algorithms as constructing a graph from our data, choosing a matrix to represent this graph, computing the eigenvectors of this matrix, and then using these eigenvectors to determine low-dimensionality embeddings of our data. However, the...

[Read More]
Tags:
Dimensionality Reduction, PCA, Laplacian Eigenmaps

## Learning Complexity and Generalization Bounds

Posted on January 30, 2020

In a typical supervised learning setting, we are given access to a dataset of samples \(S = (X_1, y_1), (X_2, y_2), ..., (X_n, y_n)\) which we assume are drawn from a distribution \(\mathcal{D}\) over \(\textbf{X} \times \textbf{y}\). For simplicity, we will assume that \(\mathbf{X}\) is either the space \(\{0,1\}^n\) or...

[Read More]
Tags:
Learning, Complexity, Generalization, VC Dimension, Vapnik, Chervonenkis, Rademacher