Dan Shiebler

My Thoughts on KDD 2018

Posted on August 31, 2018

Last week I was at KDD 2018 in London. This was my first time at KDD, and I had the opportunity to present our paper on embeddings at the Common Model Infrastructure workshop. I was really impressed by both the workshops and the main program, and I thought I’d share... [Read More]

Tags: KDD, Machine Learning, ML, Data, Conference

Representing Graphs with Low Dimensional Matrix Factorization for Fun and Profit

Posted on March 30, 2018

A solid laptop computer in 2018 has about 1TB (1000GB) of disk space and about 16GB of RAM. In comparison, internet users in the United States generate about 3000TB of data every minute [1]. An enormous amount of this data takes the form of... [Read More]

Tags: Embeddings, Matrix, Factorization, Graph, Recommendation, Word2Vec

Don't trust data too much

Posted on October 29, 2017

There’s a famous scene in the HBO show “The Wire” where the unscrupulous Deputy Commissioner Rawls is addressing the police colonels and majors, and he says: Gentlemen, the word from on high is that the felony rates district by district will decline by five percent before the end of... [Read More]

Tags: Data, Statistics, Lying, Probability, p-value, Misuse, Trust

R and R^2, the relationship between correlation and the coefficient of determination.

Posted on June 25, 2017

There are two closely related quantities in statistics: correlation (often referred to as \(R\)) and the coefficient of determination (often referred to as \(R^2\)). Today we’ll explore the nature of the relationship between \(R\) and \(R^2\), go over some common use cases for each statistic, and address some misconceptions... [Read More]

Tags: Correlation, R, R2, R^2, Coefficient of Determination, Regression, Performance

Understanding Neural Networks with Layerwise Relevance Propagation and Deep Taylor Series

Posted on April 16, 2017

Deep neural networks are some of the most powerful learning algorithms that have ever been developed. Unfortunately, they are also some of the most complex. The hierarchical non-linear transformations that neural networks apply to data can be nearly impossible to understand. This problem is exacerbated by the non-determinism of neural... [Read More]

Tags: Tensorflow, Layerwise, Relevance, Propagation, Deep, Taylor, Series, Visualization