Dataset Evaluators
Attach evaluators to datasets for automatic scoring during experiments.
Requires Phoenix 13.x.
Dataset evaluators let you attach evaluators directly to a dataset so that they run automatically, server-side, whenever you execute experiments from the Phoenix UI (for example, from the Playground). This turns your dataset into a reusable evaluation suite and removes the need to reconfigure evaluators for every experiment.
Key capabilities:

- Attach once, evaluate everywhere: Add LLM or built-in code evaluators to a dataset and reuse them across Playground experiments.
- Flexible input mapping: Map evaluator inputs to dataset fields so each example is evaluated consistently.
- Built-in visibility: Each evaluator captures traces for debugging and refinement, with details available from the evaluator view.

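To make the idea of input mapping concrete, here is a minimal Python sketch of the concept. It is not the Phoenix API: `resolve_path`, `apply_input_mapping`, `exact_match`, and the record layout are all hypothetical names chosen for illustration. In Phoenix itself, the mapping is configured on the dataset and applied server-side.

```python
from typing import Any, Mapping


def resolve_path(record: Mapping[str, Any], path: str) -> Any:
    """Follow a dotted path such as 'reference.answer' through nested dicts."""
    value: Any = record
    for key in path.split("."):
        value = value[key]
    return value


def apply_input_mapping(
    record: Mapping[str, Any], mapping: Mapping[str, str]
) -> dict[str, Any]:
    """Build evaluator keyword arguments from a record using a field mapping."""
    return {arg: resolve_path(record, path) for arg, path in mapping.items()}


def exact_match(output: str, expected: str) -> float:
    """Hypothetical code evaluator: 1.0 on a case-insensitive match, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


# Hypothetical records; the field names are assumptions, not a Phoenix schema.
records = [
    {"output": {"answer": "paris "}, "reference": {"answer": "Paris"}},
    {"output": {"answer": "Lyon"}, "reference": {"answer": "Paris"}},
]

# One mapping, defined once, tells the evaluator where each input lives.
mapping = {"output": "output.answer", "expected": "reference.answer"}

scores = [exact_match(**apply_input_mapping(r, mapping)) for r in records]
```

Because the mapping is defined once and applied to every record, each example reaches the evaluator in the same shape, which is what makes the scores comparable across experiments.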