Machine Learning Observability

Your one-stop shop for all things ML observability-related. An overview of ML observability fundamentals: the 4 pillars of ML observability, its implementation in the ML toolchain, and common ML observability techniques.

What is ML Observability?

ML Observability is a tool used to monitor, troubleshoot, and explain machine learning models as they move from research to production environments. An effective observability tool should not only surface issues automatically, but also drill down to the root cause of your ML problems and act as a guardrail for models in production.

4 Pillars of ML Observability

ML Observability in practice

Performance Analysis: Surfacing worst-performing slices

Drift: Data distribution changes over the lifetime of a model

Data Quality: Ensuring high-quality inputs and outputs

Explainability: Attributing why a certain outcome was made

Performance Analysis

ML observability provides fast, actionable performance information on models deployed in production. Performance analysis techniques vary with the model type and its real-world use case, but common metrics include accuracy, precision, recall, F1, MAE, and RMSE. Performance analysis in an ML observability system verifies that a model's performance has not degraded significantly since it was trained or first promoted to production.
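As a rough illustration of the metrics above and of surfacing worst-performing slices, the following minimal sketch computes accuracy, precision, recall, and F1 for a binary classifier and ranks slices of the data by a chosen metric. The record layout (the `actual` and `prediction` keys) and the function names are assumptions made for this example, not part of any particular observability product.

```python
from collections import defaultdict


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against empty denominators (e.g. no positive predictions).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def worst_slices(records, slice_key, metric="accuracy"):
    """Group records by slice_key and sort slices from worst to best.

    Each record is a dict with hypothetical keys "actual" and
    "prediction", plus the feature used for slicing.
    """
    groups = defaultdict(lambda: ([], []))
    for record in records:
        truths, preds = groups[record[slice_key]]
        truths.append(record["actual"])
        preds.append(record["prediction"])

    scored = {
        value: classification_metrics(truths, preds)[metric]
        for value, (truths, preds) in groups.items()
    }
    return sorted(scored.items(), key=lambda item: item[1])
```

Calling `worst_slices(records, "region")` on production records would list the regions where the model performs worst first, which is one simple way to decide where to start troubleshooting.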

Modern Model Performance Management
Two Essentials for ML Service-Level Performance Monitoring
The Playbook to Monitor Your Model’s Performance in Production
Introducing ML Performance Tracing
Towards Better Analysis of Machine Learning Models
Model Performance Measures

Drift

ML observability includes drift monitoring, which tracks changes in distribution over time for a model's inputs, outputs, and actuals. Measure drift to identify whether your models have grown stale, whether you have data quality issues, or whether your model is receiving adversarial inputs. Detecting drift helps protect your models from performance degradation and shows you where to begin resolution.
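One common way to quantify a change in distribution is the population stability index (PSI), a statistical distance between a baseline sample (for example, training data) and a production sample. The sketch below is a minimal version for a single numeric feature. The equal-width binning, the `eps` floor for empty bins, and the function name are assumptions made for this example rather than a prescribed method.

```python
import math


def population_stability_index(baseline, production, bins=10, eps=1e-6):
    """PSI between two samples of one numeric feature.

    Bin edges are derived from the baseline; production values that fall
    outside the baseline range are clipped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # Degenerate baseline: every value is identical.

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor at eps so the logarithm is defined for empty bins.
        return [max(c / len(values), eps) for c in counts]

    p = proportions(baseline)
    q = proportions(production)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))
```

Identical samples give a PSI of 0, and larger values indicate a bigger shift. A frequently cited rule of thumb treats values above roughly 0.25 as significant drift, but a suitable threshold depends on the feature and the binning, so it is best tuned per model.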

Using Statistical Distance Metrics in Machine Learning
Take My Drift Away
The Model’s Shipped; What Could Possibly go Wrong?
A Guide To Different Types of Drift
Detection of Data Drift and Outliers Affecting ML Models Performance Over Time
Model Drift 101
Data Quality

Data quality checks in an ML observability system identify hard failures in the data pipelines between training and production that can negatively impact a model’s end performance. Data quality monitoring covers cardinality shifts, missing data, data type mismatches, out-of-range values, and more, which helps you gauge model performance issues and eases root cause analysis (RCA).
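The checks listed above can be sketched as a simple per-feature report. This is a minimal illustration: the `schema` layout (with `type`, `min`, `max`, and `allowed` keys) and the function name are hypothetical, and counting values outside an allowed set is used here as a simple stand-in for detecting a cardinality shift.

```python
def data_quality_report(rows, schema):
    """Count data quality failures per feature.

    rows:   list of dicts mapping feature name to value.
    schema: hypothetical rule set per feature, for example
            {"age": {"type": (int, float), "min": 0, "max": 120},
             "state": {"type": str, "allowed": {"CA", "NY"}}}
    """
    report = {
        name: {"missing": 0, "type_mismatch": 0,
               "out_of_range": 0, "unseen_category": 0}
        for name in schema
    }
    for row in rows:
        for name, rule in schema.items():
            counts = report[name]
            value = row.get(name)
            if value is None:
                counts["missing"] += 1
                continue
            if not isinstance(value, rule["type"]):
                counts["type_mismatch"] += 1
                continue  # Range checks are meaningless on a wrong type.
            below = "min" in rule and value < rule["min"]
            above = "max" in rule and value > rule["max"]
            if below or above:
                counts["out_of_range"] += 1
            if "allowed" in rule and value not in rule["allowed"]:
                counts["unseen_category"] += 1
    return report
```

In practice the rules would typically be derived from the training data, so that a production batch is compared against what the model actually saw during training.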

Bracing Yourself for a World of Data-Centric AI
Challenges In Monitoring Production ML Pipelines
Solving Data Quality with ML Observability and Data Operations
A Quick Start To Data Quality Monitoring for ML
The Effect of Data Quality on ML Models
The Challenges of Data Quality and Data Quality Assessments

Explainability

Explainability in ML observability uncovers feature importance across training, validation, and production environments, making it possible to introspect a model and understand why it made a particular prediction. Explainability is commonly achieved by computing feature attributions with methods such as SHAP and LIME, which helps build confidence in machine-learned models and continuously improve them.
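To make the idea of feature importance concrete, here is a minimal sketch of permutation importance. Note that this is a simpler, different technique from SHAP or LIME: it only gives a global ranking, obtained by shuffling one feature at a time and measuring how much accuracy drops. The `predict` callable and the list-of-lists data layout are assumptions made for this example.

```python
import random


def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean accuracy drop per feature when that feature is shuffled.

    predict: callable taking one row (a list of feature values) and
             returning a predicted label.
    X:       list of rows, each a list of feature values.
    y:       list of true labels, aligned with X.
    """
    rng = random.Random(seed)

    def accuracy(rows):
        return sum(predict(row) == label for row, label in zip(rows, y)) / len(y)

    baseline = accuracy(X)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(shuffled))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature the model ignores gets an importance of exactly zero, while features the model relies on show a positive accuracy drop. Running the same computation on training, validation, and production data is one way to compare importance across environments.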

What Are Global, Cohort, and Local Model Explainability?
What Are the Prevailing Explainability Methods and Where Should You Use Them?
The Only 3 ML Tools You Need
A Survey on Explainable Artificial Intelligence
Using Model Explainability With Arize
Stochastic Backpropagation and Approximate Inference in Deep Generative Models
