You need a test dataset, but making one from scratch is painful. Your best examples are already in your production traces — the edge cases your users actually hit, the queries that tripped up your model, the responses that were perfect. Arize AX lets you turn those into a dataset directly, no pipeline needed.Documentation Index
Fetch the complete documentation index at: https://arize-ax.mintlify.dev/docs/llms.txt
Use this file to discover all available pages before exploring further.
How to do it
Individual: Open a span → click Add to → Dataset → choose or create a dataset. Bulk: Select multiple spans in the traces table → Add to → Dataset. Using Alyx: “Create a dataset from the filtered spans” or “Add these error traces to my test dataset” [screenshot: add to dataset dialog]Common workflows
| Workflow | How |
|---|---|
| Test set from production | Filter to edge cases → Add to Dataset → Run experiments |
| Few-shot examples | Find high-quality responses → Add to Dataset → Reference in prompts |
| Fine-tuning data | Filter for correct responses → Add to Dataset → Export |
| Human-in-the-loop | Labeling Queue → Annotate → Create Dataset |