Data problems in machine learning come in many forms, from sudden data pipeline failures to gradual feature drift over time. Statistical distance measures give teams an early indication of changes in the data affecting a model, along with insights for troubleshooting, helping them get ahead of problems before they impact business outcomes.
In this white paper, you’ll learn:
- How and where to use common statistical distance checks in machine learning
- Use cases for statistical distance checks across model inputs, model outputs, and actuals
- When and how to use specific statistical distance measures, including population stability index (PSI), Kullback–Leibler divergence (KL divergence), Jensen–Shannon divergence (JS divergence), and Earth Mover's Distance (EMD)
- Types of bins and how to ensure statistical distance check metrics make sense in real-world applications
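To make the four measures above concrete, here is a minimal sketch of how a team might compare a baseline (training) feature distribution against production values. The sample data, bin count, and variable names are illustrative assumptions, not taken from the guide; it uses standard NumPy and SciPy routines.

```python
# Illustrative sketch: comparing a baseline and a production feature
# distribution using PSI, KL divergence, JS divergence, and EMD.
# Sample data and bin choices are assumptions for demonstration only.
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import entropy, wasserstein_distance

rng = np.random.default_rng(0)
baseline = rng.normal(loc=0.0, scale=1.0, size=10_000)    # training-time values
production = rng.normal(loc=0.3, scale=1.1, size=10_000)  # drifted production values

# Bin both samples on shared edges so the histograms are comparable.
edges = np.histogram_bin_edges(baseline, bins=10)
p, _ = np.histogram(baseline, bins=edges)
q, _ = np.histogram(production, bins=edges)

eps = 1e-6  # avoid log(0) and division by zero in empty bins
p = (p + eps) / (p + eps).sum()
q = (q + eps) / (q + eps).sum()

psi = np.sum((q - p) * np.log(q / p))             # population stability index
kl = entropy(p, q)                                # KL divergence of p from q
js = jensenshannon(p, q) ** 2                     # JS distance squared = JS divergence
emd = wasserstein_distance(baseline, production)  # Earth Mover's Distance on raw values

print(f"PSI={psi:.4f}  KL={kl:.4f}  JS={js:.4f}  EMD={emd:.4f}")
```

Note the design choice of binning both samples against a single set of edges derived from the baseline: comparing histograms built on different edges would make every one of these bin-based metrics meaningless, which is why the guide devotes a section to bin types.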
Download this guide on statistical distance checks now.