Embedding representations are proliferating in modern ML systems. Embeddings capture the internal structure learned by models such as BERT and word2vec, underpin graph neural networks, represent the structure learned by image models, and are the core idea behind many recommendation systems. These high-dimensional representations are everywhere in production ML, yet they are notoriously hard to troubleshoot and visualize. This talk will cover the cutting edge of embedding visualization, how UMAP can be used in practice to troubleshoot high-dimensional structures, how UMAP has evolved since its original release, and the future of low-dimensional topological analysis of high-dimensional data.
Speakers
Leland McInnes
Founder, UMAP
Leland McInnes is a researcher at the Tutte Institute for Mathematics and Computing working on topologically motivated methods in data science. He balances his time between theoretical research, software engineering and implementation, and domain-specific problems.
Chris Moody
Data Platform Engineer, Stitch Fix
I've led machine learning teams at Stitch Fix (specializing in 3D modeling, NLP, and recommender systems), built prototype apps, and done some indie hacking. I've given talks and tutorials on building deep-learning-based recommenders in PyTorch, while also developing NLP techniques like lda2vec and 3D mesh-fitting techniques.
Francisco Castillo
Data Scientist, Arize AI