ML Monitoring Overview

Your one-stop shop for all things machine learning (ML) monitoring. Learn how to identify model performance problems, data and prediction drift, and data quality issues for your models in production.

What is ML Monitoring?

ML monitoring is a set of techniques deployed to measure key model performance metrics and detect when issues arise in machine learning models. Areas of focus include model drift, model performance, model outliers, and data quality. ML monitoring is a subset of ML observability: while monitoring consists of setting up alerts on key performance metrics such as accuracy or drift, observability implies the higher objective of getting to the bottom of any regression in performance or anomalous behavior by connecting points across validation and production.

Model Drift

Measuring model drift is a critical component of an ML monitoring system. Drift is a change in a distribution over time, measured for a model's inputs, outputs, and actuals. Monitor for drift to identify whether your models have grown stale, whether you have data quality issues, or whether there are adversarial inputs to your model. Detecting drift with ML monitoring will help protect your models from performance degradation and help you understand how to begin resolution.
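As a minimal sketch, drift between a training-time baseline and a production window can be quantified per feature with a statistic such as the Population Stability Index (PSI); the feature values, bin count, and the common ~0.2 alert threshold here are illustrative assumptions, not a prescribed setup:

```python
import numpy as np

def population_stability_index(reference, production, bins=10):
    """PSI between a reference (e.g. training) distribution and a
    production distribution of a single numeric feature."""
    # Bin edges are derived from the reference distribution.
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_counts, _ = np.histogram(reference, bins=edges)
    prod_counts, _ = np.histogram(production, bins=edges)
    # Convert counts to proportions; epsilon avoids log(0) and division by zero.
    eps = 1e-6
    ref_pct = ref_counts / ref_counts.sum() + eps
    prod_pct = prod_counts / prod_counts.sum() + eps
    return float(np.sum((prod_pct - ref_pct) * np.log(prod_pct / ref_pct)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, 10_000)  # training-time feature values
drifted = rng.normal(0.5, 1.0, 10_000)   # production values with a shifted mean
print(population_stability_index(baseline, drifted))  # markedly above zero
```

In practice a monitoring system would compute this per feature and per prediction window, alerting when the statistic crosses a threshold tuned to the feature's normal variation.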

Monitoring for Model Drift

Model drift is a key component of ML Monitoring. Dive into what drift is, why it’s important to keep track of, and how to troubleshoot and resolve the underlying issue when drift occurs.

A Data System For Monitoring and Improving Machine Learning Systems

Learn how to automate the life cycle of model construction, deployment, and ML monitoring by providing a set of novel high-level, declarative abstractions.

Hidden Technical Debt In Machine Learning Systems

Explore machine learning specific risk factors with ML monitoring such as boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, and more.

What’s Your ML Test Score?

An ML Test Score Rubric to quantify common real-world ML production issues.

Applied Machine Learning at Facebook

An overview of the hardware and software infrastructure that support Facebook's machine learning initiatives at scale.

Scaling Machine Learning as a Service

An outline of Uber's Machine Learning as a Service (MLaaS) as it operates globally and its scalability challenges.


Model Performance

Model performance metrics measure how well your model performs in production, using measures such as accuracy, recall, precision, F1, MAE, or MAPE. Measuring performance is not one-size-fits-all and varies case by case, but correctly measuring your model's performance is essential to ensure you are shipping a consistent and effective product to your customers.
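The classification metrics named above can be computed directly from a confusion matrix; this sketch uses made-up labels and predictions purely for illustration:

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for a binary classifier,
    computed from confusion-matrix counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Illustrative ground truth (actuals) and production predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```

Which metric to alert on depends on the use case: precision matters more when false positives are costly, recall when misses are, and regression models would track MAE or MAPE against actuals instead.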

The Model Has Shipped; What Could Possibly Go Wrong?

Examples of potential failure modes, along with the most common symptoms they exhibit in your production model's performance, that can be mitigated with ML monitoring.

Monitor Your Models In Production

An overview of common challenges in monitoring production models, connected to the availability of ground truth and the performance metrics available to measure models in each scenario.

Underspecification Presents Challenges for Credibility in Modern Machine Learning

An argument for explicitly accounting for underspecification in modeling pipelines that are intended for real-world deployment in any domain.

Software Engineering For Machine Learning

An outline of Microsoft's emerging nine-step workflow for machine learning, its best practices, and how the AI development process differs from traditional software development.

Machine Learning Testing: Survey, Landscapes, and Horizons

A survey of testing machine learning systems on testing properties, components, workflow, and various application scenarios.

The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction

28 specific tests and monitoring needs drawn from a wide range of production ML systems to quantify common issues, provide a road map, improve production readiness, and pay down technical debt.


Model Outliers

Detecting model outliers is a powerful feature of ML monitoring. Outlier monitoring requires a multivariate analysis across all of the input features to find individual predictions that may be outliers. To protect against unwanted inputs, you can employ an unsupervised learning method to categorize model inputs and predictions, allowing you to discover cohorts of anomalous examples and safeguard your model's performance over time.
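One simple multivariate approach, sketched here with synthetic data, is to flag predictions whose feature vectors are far from the bulk of the data under the Mahalanobis distance, which accounts for correlations across all input features; the threshold of 3 and the injected point are illustrative assumptions (an isolation forest or other unsupervised detector would serve the same role):

```python
import numpy as np

def mahalanobis_outliers(X, threshold=3.0):
    """Flag rows of X whose Mahalanobis distance from the feature mean
    exceeds `threshold`, using the full feature covariance."""
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    inv_cov = np.linalg.inv(cov)
    diff = X - mean
    # Squared Mahalanobis distance for each row's feature vector.
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return np.sqrt(d2) > threshold

rng = np.random.default_rng(1)
inliers = rng.normal(0, 1, size=(500, 3))  # typical model inputs
anomaly = np.array([[8.0, -8.0, 8.0]])     # an adversarial-looking input
X = np.vstack([inliers, anomaly])
flags = mahalanobis_outliers(X)
print(flags[-1], flags[:-1].mean())  # injected point flagged; few inliers are
```

Running such a check on each prediction batch lets you route anomalous cohorts for review before they degrade measured performance.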

Outlier Detection in High Dimensional Data

A novel outlier detection algorithm based on principal component analysis and kernel density estimation to deal with the challenges of high dimensional data.

Improving the Accuracy of Convolutional Neural Networks by Identifying and Removing Outlier Images in Datasets Using t-SNE

A multistage method for detecting and removing outliers in high-dimensional data, based on t-distributed stochastic neighbor embedding (t-SNE), which reduces a high-dimensional map of features to a lower, two-dimensional probability density distribution.

Anomaly Detection Using Isolation Forest in Python

An overview of anomaly detection, its use cases, and the definition of an isolation forest and its applications for anomaly detection.

Use of Machine Learning For Anomaly Detection Problem in Large Astronomical Databases

An examination of common problems associated with applying machine learning methods to anomaly detection in large astronomical databases.


Data Quality

Model performance is highly dependent on the quality of the data sources powering your model's features; use ML monitoring to identify cardinality shifts, data type mismatches, missing data, and more. By monitoring data quality metrics, you can improve your ML systems by ensuring your model is trained on high-quality data and that your data remains high quality throughout the model's life in production.
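The checks named above can be sketched as a batch comparison between training-time expectations and a production window; the schema, column names, and example rows below are hypothetical, chosen only to demonstrate missing-value, type-mismatch, and cardinality-shift detection:

```python
def data_quality_report(reference_rows, production_rows, schema):
    """Compare a production batch against training-time expectations:
    missing values, type mismatches, and cardinality shifts per column."""
    report = {}
    for column, expected_type in schema.items():
        prod_values = [row.get(column) for row in production_rows]
        missing = sum(v is None for v in prod_values)
        mismatched = sum(
            v is not None and not isinstance(v, expected_type) for v in prod_values
        )
        ref_cardinality = len({row.get(column) for row in reference_rows})
        prod_cardinality = len(set(prod_values))
        report[column] = {
            "missing": missing,
            "type_mismatch": mismatched,
            "cardinality_shift": prod_cardinality - ref_cardinality,
        }
    return report

schema = {"age": int, "country": str}  # expected types at training time
reference = [{"age": 34, "country": "US"}, {"age": 29, "country": "DE"}]
production = [
    {"age": 41, "country": "US"},
    {"age": "41", "country": "FR"},  # type mismatch: string instead of int
    {"age": None, "country": "BR"},  # missing value
]
print(data_quality_report(reference, production, schema))
```

A production system would run this on every ingestion batch and alert when counts exceed expected baselines, catching upstream pipeline breaks before they reach the model.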

Data Quality Monitoring

An introduction to why your team should pay close attention to the quality of your data and its impact on your model's end performance.

Everyone Wants to Do the Model Work, Not the Data Work

A report on data practices in high-stakes AI applications that defines, identifies, and presents empirical evidence on Data Cascades and their impacts on AI/ML practice.

Challenges in Deploying Machine Learning

The challenges that practitioners face at each stage of the ML lifecycle via survey analysis to lay out a research agenda for lifecycle improvement.

Data Validation for Machine Learning

A data validation system designed to detect anomalies specifically in data fed into machine learning pipelines, currently deployed in an end-to-end ML platform at Google.

Beyond Accuracy: Behavioral Testing of NLP Models

An overview of a task-agnostic methodology for testing NLP models, including a matrix of general linguistic capabilities and test types that facilitate comprehensive test ideation.


Beyond Monitoring - ML Observability

Continually improving your models does not stop at ML monitoring. Drill down to truly understand your models with ML observability, which encompasses monitoring, validation, and troubleshooting to improve model performance and enhance your AI ROI. Empower your teams with ML observability to automatically detect model issues, diagnose hard-to-find problems, and improve your models' performance in production.

Beyond Monitoring: ML Observability

An overview of going beyond monitoring to allow data and ML engineering teams to understand the health of their data-driven systems in a more holistic manner.

What is ML Observability?

An exploration of what we can infer from a model's predictions, explainability insights, production feature data, and training data to understand the cause behind model behavior and build workflows to improve it.

