Test prompts on spans (replay)
Replay your spans in the Prompt Playground
Any span on the LLM Tracing page can be loaded directly into the Prompt Playground for replay and iteration.
All LLM parameters, prompt templates, input messages, variables, and function definitions are automatically populated, so you can replicate the exact call without manual setup. This allows you to iterate quickly, test changes in context, and continuously refine prompts.
Here, you can:
- load different prompts from the Prompt Hub
- experiment with different LLMs
- tune invocation parameters
- add new messages or function definitions to your prompt
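To make this concrete, the sketch below shows the kind of payload a replayed span carries into the Playground: invocation parameters, the prompt template and its variables, input messages, and function definitions. The field names are illustrative assumptions for this example, not the product's exact schema.

```python
# Illustrative shape of the data a replayed span brings into the
# Playground. Field names here are assumptions for the sketch,
# not the product's exact schema.
replayed_span = {
    "model": "gpt-4o",
    "invocation_parameters": {"temperature": 0.2, "max_tokens": 512},
    "prompt_template": "Answer the question using only {context}.",
    "template_variables": {"context": "...retrieved documents..."},
    "input_messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is our refund policy?"},
    ],
    "functions": [
        {
            "name": "lookup_policy",
            "description": "Fetch a policy document by topic.",
            "parameters": {"type": "object", "properties": {}},
        }
    ],
}
```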
Why Replay Spans?
Span replay bridges production traces and prompt iteration, allowing you to debug and improve prompts in their real execution context. Instead of guessing what caused a response, you can load the exact span, with its inputs, variables, and function calls, directly into the Prompt Playground and experiment freely.
With span replay, you can:
- Debug in context: Reproduce the precise LLM invocation that occurred in production, including all inputs and configurations.
- Iterate instantly: Adjust prompts, parameters, or models and re-run them on the same real data, with no manual reconstruction needed (see the sketch after this list).
- Validate improvements: Compare responses side by side and confirm that changes lead to measurable quality gains.
- Accelerate experimentation: Turn traces into actionable testing environments for faster, data-driven prompt refinement.
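Conceptually, re-running a replayed span amounts to sending its captured messages back to the provider with edited settings. Below is a minimal sketch of that idea using the OpenAI Python client; the model name, messages, and parameter values are assumptions for illustration, and the Playground performs the equivalent for you in the UI.

```python
# Minimal sketch: replay a span's captured messages with a modified
# temperature and compare outputs. The client and values are purely
# illustrative; the Playground does this for you interactively.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

captured_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is our refund policy?"},
]

for temperature in (0.0, 0.7):  # original setting vs. candidate setting
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=captured_messages,
        temperature=temperature,
    )
    print(f"temperature={temperature}: {response.choices[0].message.content}")
```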
In short, span replay transforms tracing data into an interactive feedback loop — connecting observation (what happened) with optimization (how to make it better).