
Machine Learning Observability

Your one-stop shop for all things ML observability-related. An overview of ML observability fundamentals: the 4 pillars of ML observability, its implementation in the ML toolchain, and common ML observability techniques.

What is ML Observability?

ML observability is the practice of monitoring, troubleshooting, and explaining machine learning models as they move from research into production. An effective observability tool should not only surface issues automatically, but also drill down to their root cause and act as a guardrail for models in production.

4 Pillars of ML Observability

ML Observability in practice

Performance Analysis: surfacing worst-performing slices

Drift: data distribution changes over the lifetime of a model

Data Quality: ensuring high-quality inputs and outputs

Explainability: attributing why a certain outcome was made
Performance Analysis

ML observability provides fast, actionable performance information on models deployed in production. While performance analysis techniques vary case by case depending on model type and real-world use case, common metrics include accuracy, precision, recall, F1, MAE, and RMSE. Performance analysis in an ML observability system ensures that performance has not degraded drastically from when the model was trained or when it was first promoted to production.
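The metrics named above are straightforward to compute from predictions and ground truth. A minimal sketch in plain Python (function names here are illustrative, not from any particular library):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for a binary classifier."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def mae(y_true, y_pred):
    """Mean absolute error, for regression models."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, for regression models."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5
```

In practice an observability system computes these continuously over production windows and compares them against the training or launch baseline, rather than once over a static test set.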


Drift

ML observability encompasses drift monitoring: detecting changes in distribution over time, measured on a model's inputs, outputs, and actuals. Measuring drift helps you identify whether your models have grown stale, whether you have data quality issues, or whether adversarial inputs are reaching your model. Detecting drift protects your models from performance degradation and helps you understand where to begin resolution.
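One common drift statistic is the population stability index (PSI), which bins a reference (training) distribution and compares bucket frequencies against production. A minimal sketch, with illustrative names and a small epsilon as an assumption to handle empty bins:

```python
import math

def psi(reference, production, bins=10):
    """Population stability index between a reference and a production sample."""
    lo, hi = min(reference), max(reference)
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]

    def bucket_fractions(values):
        counts = [0] * bins
        for v in values:
            i = sum(v > e for e in edges)  # index of the bin containing v
            counts[i] += 1
        # epsilon keeps log() defined when a bin is empty
        return [(c + 1e-6) / (len(values) + bins * 1e-6) for c in counts]

    ref = bucket_fractions(reference)
    prod = bucket_fractions(production)
    return sum((p - r) * math.log(p / r) for r, p in zip(ref, prod))
```

A commonly cited rule of thumb treats PSI above roughly 0.2 as a notable shift worth investigating, though useful thresholds vary by feature and use case.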

Data Quality

Data quality checks in an ML observability system identify hard failures in the data pipelines between training and production that can degrade a model's end performance. Data quality monitoring covers cardinality shifts, missing data, data type mismatches, out-of-range values, and more, making it easier to gauge model performance issues and ease root cause analysis.
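Several of these checks can be expressed as a simple validation pass over incoming records against an expected schema. The field names, schema shape, and issue labels below are assumptions for illustration, not a real pipeline's contract:

```python
def check_batch(records, schema):
    """Return (row_index, field, issue) tuples for a batch of records.

    `schema` maps field name -> (expected_type, (min, max) or None).
    """
    issues = []
    for i, row in enumerate(records):
        for field, (expected_type, bounds) in schema.items():
            value = row.get(field)
            if value is None:
                issues.append((i, field, "missing"))
            elif not isinstance(value, expected_type):
                issues.append((i, field, "type mismatch"))
            elif bounds and not (bounds[0] <= value <= bounds[1]):
                issues.append((i, field, "out of range"))
    return issues
```

Running a check like this on every production batch, and alerting when the issue rate jumps, is what turns silent pipeline failures into actionable signals.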


Explainability

Explainability in ML observability uncovers feature importance across training, validation, and production environments, providing the ability to introspect and understand why a model made a particular prediction. It is commonly achieved with attribution techniques such as SHAP and LIME, which help build confidence in and continuously improve machine learning models.
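SHAP and LIME require their own libraries and produce per-prediction attributions; a simpler relative that conveys the same idea of attributing model behavior to features is permutation importance, which measures how much a score drops when each feature column is shuffled. A model-agnostic sketch (names and interfaces are illustrative):

```python
import random

def permutation_importance(predict, X, y, score, n_repeats=5, seed=0):
    """Average drop in a higher-is-better score when each feature is shuffled.

    predict: callable, list of rows -> list of predictions
    X: list of rows (each a list of feature values); y: targets
    """
    rng = random.Random(seed)
    baseline = score(y, predict(X))
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[col] for row in X]
            rng.shuffle(column)  # break the feature's link to the target
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, column)]
            drops.append(baseline - score(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances
```

Features whose shuffling barely moves the score contribute little to the model's predictions; large drops indicate the features the model actually relies on.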
