Your automated evals say a response is “grounded”, but is it really? Sometimes you need a human to weigh in. Annotations let your team add ground-truth labels and scores directly on spans, building a feedback loop between humans and your AI.
How to do it
- Open a trace and click into any span
- Click the Annotate toggle in the span toolbar
- Select an annotation config (e.g., “Correctness”, “Helpfulness”) or create a new one
- Add your label or score; it saves automatically
Annotation configs
Configs define the schema for your labels. They are shared across the project so everyone labels against the same schema.
- Categorical: fixed labels (e.g., “correct”, “incorrect”, “partially correct”)
- Continuous: numeric scores on a range (e.g., 1–5)
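To make the two config types concrete, here is a minimal Python sketch of what such a schema enforces. The class names, fields, and `validate` method are hypothetical and exist only for illustration; they are not part of any Arize API, and in the product you create configs through the UI as described above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CategoricalConfig:
    """Hypothetical config: the annotator picks one label from a fixed set."""
    name: str
    labels: Tuple[str, ...]

    def validate(self, value: str) -> str:
        # Reject any label that is not part of the shared schema.
        if value not in self.labels:
            raise ValueError(f"{value!r} is not one of {self.labels}")
        return value


@dataclass(frozen=True)
class ContinuousConfig:
    """Hypothetical config: the annotator gives a numeric score within a range."""
    name: str
    minimum: float
    maximum: float

    def validate(self, value: float) -> float:
        # Reject scores outside the inclusive [minimum, maximum] range.
        if not self.minimum <= value <= self.maximum:
            raise ValueError(f"{value} is outside [{self.minimum}, {self.maximum}]")
        return value


# Example configs mirroring the ones named in this page.
correctness = CategoricalConfig("Correctness", ("correct", "incorrect", "partially correct"))
helpfulness = ContinuousConfig("Helpfulness", 1, 5)
```

Because every annotator validates against the same config object, a label such as "mostly right" or a score of 7 is rejected rather than silently creating a new category.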
Annotation notes
In addition to labels and scores, you can attach free-text notes to any annotation. Notes are useful for explaining edge cases, providing context for disagreements, or flagging spans for follow-up discussion.
Measure eval quality with annotations
Use annotations as ground truth to measure how well your automated evals perform, for example by checking how often an eval's label matches the human label on the same span.
Annotations vs. evals
| | Annotations | Evals |
|---|---|---|
| Who | Humans | Automated (LLM-as-judge or code) |
| Scale | Small samples | Every span |
| Best for | Ground truth, calibration | Production monitoring |
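
As a rough sketch of the calibration use case, the Python below compares human annotations with eval labels on the same spans. It assumes you have already collected the two label lists yourself; the function names and sample data are hypothetical and do not come from any Arize API. It reports raw agreement and Cohen's kappa, which corrects agreement for matches expected by chance.

```python
from collections import Counter
from typing import List


def agreement(human: List[str], evals: List[str]) -> float:
    """Fraction of spans where the eval label matches the human annotation."""
    if len(human) != len(evals) or not human:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(h == e for h, e in zip(human, evals)) / len(human)


def cohens_kappa(human: List[str], evals: List[str]) -> float:
    """Agreement between the two label lists, corrected for chance."""
    n = len(human)
    observed = agreement(human, evals)
    human_counts, eval_counts = Counter(human), Counter(evals)
    # Chance agreement: probability both sides pick the same label independently.
    expected = sum(human_counts[label] * eval_counts[label] for label in human_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both sides always use the same single label
    return (observed - expected) / (1 - expected)


# Illustrative data only: one label per span, in the same span order.
human_labels = ["grounded", "grounded", "ungrounded", "grounded", "ungrounded", "ungrounded"]
eval_labels = ["grounded", "grounded", "grounded", "grounded", "ungrounded", "ungrounded"]

print(f"agreement: {agreement(human_labels, eval_labels):.2f}")
print(f"kappa:     {cohens_kappa(human_labels, eval_labels):.2f}")
```

Raw agreement is easy to read but can look high when one label dominates; kappa is a common complement for that reason. Low values on either measure suggest the eval prompt or code needs revisiting before you rely on it for production monitoring.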