Glossary of AI Terminology

What Is Evaluation Drift?

Evaluation drift occurs when an evaluator stops measuring the behavior the team actually cares about. This can happen because user expectations change, policies change, the judge model changes, rubrics become stale, or the system finds ways to pass the eval without improving user outcomes.

Evaluation drift is dangerous because dashboards stay green while the product gets worse for users. Periodic human review, judge calibration against human labels, and error analysis on production traffic help keep evals aligned with the behavior the team cares about.
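One way to make judge calibration concrete is to score the same sample of outputs with both human reviewers and the automated judge, then track their chance-corrected agreement over time. The sketch below is illustrative only: the function names (`cohen_kappa`, `drift_alert`), the "pass"/"fail" labels, and the 0.6 threshold are assumptions chosen for the example, not part of any particular evaluation framework.

```python
from collections import Counter


def cohen_kappa(human, judge):
    """Chance-corrected agreement (Cohen's kappa) between two label lists."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(human)
    # Fraction of items where the judge matches the human label.
    observed = sum(h == j for h, j in zip(human, judge)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    h_counts, j_counts = Counter(human), Counter(judge)
    expected = sum(h_counts[label] * j_counts[label] for label in h_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


def drift_alert(human, judge, min_kappa=0.6):
    """Return (alert, kappa); alert is True when agreement falls below min_kappa.

    min_kappa=0.6 is an assumed threshold; tune it for your own rubric.
    """
    kappa = cohen_kappa(human, judge)
    return kappa < min_kappa, kappa


if __name__ == "__main__":
    # Hypothetical review sample: the judge passes everything.
    human_labels = ["pass", "fail", "pass", "fail"]
    judge_labels = ["pass", "pass", "pass", "pass"]
    alert, kappa = drift_alert(human_labels, judge_labels)
    print(f"kappa={kappa:.2f} alert={alert}")
```

Running a check like this on a fresh human-labeled sample at a regular cadence, and again whenever the judge model or rubric changes, gives an early signal that the evaluator and the reviewers have diverged. Kappa is used here rather than raw agreement because a judge that passes nearly everything can still show high raw agreement on a mostly passing sample.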
