Trace-Level Evals
How to measure LLM application performance at the trace level.
A trace is a complete record of all operations (spans) that occur during a single execution or request in your LLM application. Trace-level evaluations provide a holistic view of:
Whether the overall workflow or task was completed successfully
The quality and correctness of the end-to-end process
Aggregate metrics such as latency, errors, and evaluation labels for the entire trace
The success of multi-step workflows (e.g., agentic reasoning, RAG pipelines)
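To make the aggregate view concrete, here is a minimal sketch of rolling span data up into trace-level metrics. The `Span` fields and the `summarize_trace` helper are hypothetical names chosen for illustration; they are not the platform's API, and a real trace export will likely use a different schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    # Hypothetical span record; field names are assumptions for illustration.
    trace_id: str
    name: str
    start_ms: int
    end_ms: int
    error: bool = False
    eval_label: Optional[str] = None


def summarize_trace(spans):
    """Aggregate the spans of one trace into trace-level metrics."""
    if not spans:
        raise ValueError("a trace must contain at least one span")
    # Trace latency is the wall-clock time from the earliest start to the latest end.
    latency_ms = max(s.end_ms for s in spans) - min(s.start_ms for s in spans)
    # Count spans that recorded an error.
    error_count = sum(1 for s in spans if s.error)
    # Tally evaluation labels, ignoring spans that were not evaluated.
    labels = Counter(s.eval_label for s in spans if s.eval_label is not None)
    return {
        "trace_id": spans[0].trace_id,
        "span_count": len(spans),
        "latency_ms": latency_ms,
        "error_count": error_count,
        "eval_labels": dict(labels),
    }


example_trace = [
    Span("t1", "retrieve_documents", 0, 120, eval_label="relevant"),
    Span("t1", "generate_answer", 120, 900),
    Span("t1", "call_tool", 900, 950, error=True),
]
summary = summarize_trace(example_trace)
```

A summary like this describes the trace as a whole, which is the level at which a trace-scoped evaluator operates, rather than judging each span in isolation.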
Trace-Level Evaluations via UI
To run evaluations at the trace level in the UI, set the evaluator scope to “Trace” for each evaluator you want to operate at that level.
You can also apply filters to focus the evaluation on specific parts of a trace. If no filters are applied, the evaluation considers the entire trace.
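The filtering behavior can be pictured with a short sketch. This is an illustration of the idea only, assuming spans are plain dictionaries with hypothetical `kind`, `name`, and `output` keys; it does not reflect how the UI implements filters.

```python
from typing import Callable, Optional


def select_spans(spans, span_filter: Optional[Callable[[dict], bool]] = None):
    """Return the spans an evaluator should see.

    With no filter, every span in the trace is kept (the default behavior).
    """
    if span_filter is None:
        return list(spans)
    return [s for s in spans if span_filter(s)]


def build_eval_input(spans, span_filter=None):
    """Flatten the selected spans into one text block for an evaluator."""
    selected = select_spans(spans, span_filter)
    return "\n".join(f"[{s['kind']}] {s['name']}: {s['output']}" for s in selected)


# Hypothetical trace with three spans of different kinds.
trace = [
    {"kind": "retriever", "name": "search_docs", "output": "3 documents"},
    {"kind": "llm", "name": "draft_answer", "output": "Paris is the capital."},
    {"kind": "tool", "name": "fact_check", "output": "confirmed"},
]

# No filter: the whole trace is evaluated.
full_input = build_eval_input(trace)

# With a filter: only the retriever spans are evaluated.
retriever_input = build_eval_input(trace, lambda s: s["kind"] == "retriever")
```

Narrowing the input this way is useful when an evaluator only needs one stage of the workflow, such as judging retrieval quality without the generation steps.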