Glossary of AI Terminology

What Is a Trace?

Trace

A trace is the complete execution record of what an AI system actually did while handling a request.

In traditional software, engineers could often understand system behavior by reading the code. The code defined the execution path ahead of time, and most behavior was deterministic and predictable before the system ran.

AI systems changed that model.

With LLMs and agents, important decisions happen dynamically at runtime: which tool to call, what context to retrieve, how to route a request, whether to retry, how to plan the next step, and how to synthesize a final response. You cannot fully understand this behavior from source code alone because it emerges during execution.

That makes traces the new source of truth.

A trace captures the full sequence of operations that occurred during a workflow, including:

  • model calls
  • retrieval steps
  • tool invocations
  • routing decisions
  • agent actions
  • retries and failures
  • intermediate outputs
  • final responses

Traces are typically organized as spans, where each span represents a single operation inside the workflow. Together, these spans form a timeline of the system’s behavior.
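
To make the span structure concrete, here is a minimal sketch in Python. The Span and Trace classes, their field names, and the sample values are all hypothetical and chosen for illustration only; real tracing libraries define their own schemas.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One operation inside a workflow (hypothetical schema)."""
    span_id: str
    kind: str                        # e.g. "agent", "retrieval", "model_call"
    name: str
    start_ms: int                    # offset from the start of the request
    end_ms: int
    parent_id: Optional[str] = None  # links a span to the step that triggered it
    status: str = "ok"               # "ok" or "error"

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


@dataclass
class Trace:
    """All spans recorded while handling a single request."""
    trace_id: str
    spans: list[Span] = field(default_factory=list)

    def timeline(self) -> list[Span]:
        # Ordering spans by start time gives the timeline of system behavior.
        return sorted(self.spans, key=lambda s: s.start_ms)


# Spans may be recorded out of order; the timeline restores the sequence.
trace = Trace(
    trace_id="req-001",
    spans=[
        Span("s3", "model_call", "synthesize_answer", 120, 480, parent_id="s1"),
        Span("s1", "agent", "handle_request", 0, 500),
        Span("s2", "retrieval", "search_docs", 10, 110, parent_id="s1"),
    ],
)

for span in trace.timeline():
    print(f"{span.start_ms:>4} ms  {span.kind:<11} {span.name} ({span.duration_ms} ms)")
```

The parent_id field is one common way to record which step triggered which, so the same spans can be read either as a flat timeline or as a tree.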

In modern AI systems, traces are the foundation for observability, debugging, and evaluation. Source code tells you what the system could do; traces tell you what it actually did.

That distinction matters because failures in agentic systems often come from runtime behavior rather than static logic alone. A retrieval step may return the wrong document. An agent may choose the wrong tool. A router may send a request down the wrong path. None of these failures necessarily raises an error, so they often become visible only when you inspect the trace.
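
As a rough illustration of this kind of diagnosis, the Python sketch below scans one recorded trace for two problems: spans that failed outright, and tool calls that completed without error but were not expected for this type of request. The span dictionaries, the tool names, and the expected-tool set are hypothetical and invented for the example.

```python
from typing import Any

Span = dict[str, Any]

# A hypothetical trace for a refund question. Every value here is invented.
trace: list[Span] = [
    {"kind": "router", "name": "route_request", "status": "ok", "output": "billing_agent"},
    {"kind": "retrieval", "name": "search_docs", "status": "ok", "output": "shipping_faq.md"},
    {"kind": "tool", "name": "get_weather", "status": "ok", "output": "sunny"},
    {"kind": "tool", "name": "lookup_order", "status": "error", "output": None},
    {"kind": "tool", "name": "lookup_order", "status": "ok", "output": "order 1138"},
    {"kind": "final_response", "name": "respond", "status": "ok", "output": "Refund issued."},
]


def failed_spans(spans: list[Span]) -> list[Span]:
    """Return spans that were recorded with an error status."""
    return [s for s in spans if s["status"] == "error"]


def unexpected_tool_calls(spans: list[Span], expected_tools: set[str]) -> list[str]:
    """Return names of tools that ran but were not expected for this request."""
    return [
        s["name"]
        for s in spans
        if s["kind"] == "tool" and s["name"] not in expected_tools
    ]


print([s["name"] for s in failed_spans(trace)])
print(unexpected_tool_calls(trace, {"lookup_order", "issue_refund"}))
```

Note that the get_weather call has status "ok": a wrong tool choice is not an exception, which is why a check on status alone would miss it.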

Traces are what flow into evaluation pipelines, what monitoring systems inspect, and what engineers and reviewers analyze when diagnosing failures or improving system behavior over time.
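
A minimal sketch of how traces might feed an evaluation step is shown below, in Python. The pass criteria (no error spans and a non-empty final response) and the span fields are assumptions made for illustration; a real pipeline would apply checks specific to the application.

```python
from typing import Any

Span = dict[str, Any]


def evaluate_trace(spans: list[Span]) -> dict[str, Any]:
    """Score one trace with two simple, hypothetical checks."""
    error_count = sum(1 for s in spans if s.get("status") == "error")
    final_outputs = [s.get("output") for s in spans if s.get("kind") == "final_response"]
    has_final = bool(final_outputs and final_outputs[-1])
    return {
        "error_count": error_count,
        "has_final_response": has_final,
        "passed": error_count == 0 and has_final,
    }


def pass_rate(traces: list[list[Span]]) -> float:
    """Fraction of traces that pass every check (0.0 for an empty batch)."""
    if not traces:
        return 0.0
    results = [evaluate_trace(t) for t in traces]
    return sum(r["passed"] for r in results) / len(results)


# Two hypothetical traces: one clean run, one with a failed tool call.
traces = [
    [
        {"kind": "model_call", "status": "ok", "output": "draft"},
        {"kind": "final_response", "status": "ok", "output": "Your order ships Monday."},
    ],
    [
        {"kind": "tool", "status": "error", "output": None},
        {"kind": "final_response", "status": "ok", "output": "Sorry, something went wrong."},
    ],
]

print(pass_rate(traces))
```

Tracking a number like this over time is one way a monitoring system could flag regressions, with the failing traces themselves handed to engineers for review.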
