Setting up tracing for agent experiments

When you run an agent experiment, Arize creates a top-level span for each experiment run and sends a W3C traceparent header with the request. If your agent extracts that header and uses it as the parent context for its own spans, all the agent’s tracing — LLM calls, tool calls, retrieval steps — nests under the experiment-run trace in the Arize UI. This page covers what your agent needs to do.

What you need to do

Two things, in order:

1. Extract traceparent from the incoming request and use it as the parent context for the top-level span you create.
2. Use register_with_routing() and set_routing_context() so spans land in the right space and project per-request, instead of the static one configured at app startup.

If you skip step 1, your agent’s spans appear in a separate, orphan trace. If you skip step 2, every request’s spans go to whatever space your env vars point at — even when Arize calls from a different space.

How Arize propagates context

Every agent-experiment request includes these headers:

traceparent: 00-<32-char-trace-id>-<16-char-span-id>-01
tracestate: <vendor-state>
baggage: arize.space_id=<space>,arize.project_name=<project>

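traceparent is the W3C Trace Context header. It links your spans to a specific Arize-side parent span.
tracestate carries vendor-specific trace state alongside traceparent, as defined by the same W3C standard.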
baggage carries the space_id and project_name Arize wants your spans routed to. Both fields are also available in the request body as arize_metadata.space_id and arize_metadata.project_name.
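
To check what your agent actually receives, it can help to parse the traceparent value by hand before involving OpenTelemetry. The sketch below validates the version-00 layout shown above (version, 32-character trace ID, 16-character span ID, flags). The names parse_traceparent and TraceParent are hypothetical helpers for debugging, not part of any Arize or OpenTelemetry API.

```python
import re
from typing import NamedTuple, Optional

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT_RE = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<span_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    """Hypothetical container for the four traceparent fields."""
    version: str
    trace_id: str
    span_id: str
    sampled: bool


def parse_traceparent(header: Optional[str]) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is missing or malformed."""
    if not header:
        return None
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None or match["version"] == "ff":
        return None
    # The spec treats all-zero trace and span IDs as invalid.
    if set(match["trace_id"]) == {"0"} or set(match["span_id"]) == {"0"}:
        return None
    # Bit 0 of the flags byte is the "sampled" flag.
    sampled = bool(int(match["flags"], 16) & 0x01)
    return TraceParent(match["version"], match["trace_id"], match["span_id"], sampled)
```

If this returns None for a request you expected to come from an experiment run, the header was stripped or rewritten before it reached your handler, and propagate.extract() will have nothing to link to.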

Extracting the parent context

OpenTelemetry’s propagate.extract() handles this in one line. Pass the incoming request headers, get back a Context you attach before creating your top-level span.

from fastapi import FastAPI, Request
from opentelemetry.context import attach, detach
from opentelemetry.propagate import extract
from opentelemetry.trace import get_tracer

tracer = get_tracer("my-agent", "1.0.0")
app = FastAPI()

@app.post("/invoke")
async def invoke(request: Request, body: dict):
    # 1. Extract W3C trace context from the incoming headers.
    parent_ctx = extract(dict(request.headers))

    # 2. Attach it as the current OTel context.
    token = attach(parent_ctx)
    try:
        # 3. Start your CHAIN span. Because parent_ctx is now current,
        #    this span automatically becomes a child of the Arize-side
        #    experiment-run trace.
        with tracer.start_as_current_span("run_agent") as chain_span:
            chain_span.set_attribute("openinference.span.kind", "CHAIN")
            chain_span.set_attribute("input.value", body.get("input", {}).get("goal", ""))

            # ... your agent code, LLM calls, tool spans ...
            result = await my_agent_logic(body)

            chain_span.set_attribute("output.value", result.text)
            return result
    finally:
        detach(token)

import express from "express";
import { context, propagation, trace } from "@opentelemetry/api";

const tracer = trace.getTracer("my-agent", "1.0.0");
const app = express();
app.use(express.json());

app.post("/invoke", async (req, res) => {
  // 1. Extract W3C trace context from the incoming headers.
  const parentCtx = propagation.extract(context.active(), req.headers);

  // 2. Run the rest inside that context.
  await context.with(parentCtx, () =>
    // startActiveSpan makes run_agent the current span, so spans created
    // inside the callback (LLM calls, tools) nest under it.
    tracer.startActiveSpan("run_agent", async (span) => {
      span.setAttribute("openinference.span.kind", "CHAIN");
      span.setAttribute("input.value", req.body.input?.goal ?? "");

      try {
        // ... your agent code, LLM calls, tool spans ...
        const result = await myAgentLogic(req.body);

        span.setAttribute("output.value", result.text);
        res.json(result);
      } finally {
        span.end();
      }
    })
  );
});

app.listen(8000);

Don’t pass context=parent_ctx directly to start_as_current_span() if you’re also using set_routing_context() (next section). Doing so overrides the routing context, and Arize will drop the span. Always attach() first, layer routing on top, then start the span with no explicit context override.

Per-request space and project routing

The standard arize.otel.register() call locks your TracerProvider to a single space and project at app startup. That’s wrong for an agent endpoint that one space’s experiment runner might call, then another space’s might call a minute later. Use register_with_routing() instead — it leaves routing unset at startup, and you decide per-request via set_routing_context():

import os
from arize.otel import register_with_routing, set_routing_context
from openinference.instrumentation.anthropic import AnthropicInstrumentor

# At app startup — no fixed space_id / project_name.
tracer_provider = register_with_routing(api_key=os.environ["ARIZE_API_KEY"])
AnthropicInstrumentor().instrument(tracer_provider=tracer_provider)

Then in your /invoke handler, resolve the space and project from the request, and wrap the span creation:

async def invoke(request, body):
    parent_ctx = extract(dict(request.headers))

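    # Pull routing info from arize_metadata. Fall back to defaults so
    # set_routing_context() never receives None. ARIZE_SPACE_ID is an
    # assumed env var name; use whatever your deployment defines.
    md = body.get("arize_metadata", {})
    space_id = md.get("space_id") or os.environ["ARIZE_SPACE_ID"]
    project_name = md.get("project_name") or "my-agent"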

    token = attach(parent_ctx)
    try:
        # Layer routing on top of the parent context.
        with set_routing_context(space_id=space_id, project_name=project_name):
            with tracer.start_as_current_span("run_agent") as chain_span:
                chain_span.set_attribute("openinference.span.kind", "CHAIN")
                # ... rest of your agent ...
    finally:
        detach(token)

Both space_id and project_name must be set inside set_routing_context(), or arize-otel drops the span entirely. If either is missing in arize_metadata, fall back to env defaults or use the request’s baggage header — but never call set_routing_context() with None.

What the final trace looks like

When both pieces are in place, the trace tree in Arize looks like:

agent.experiment.run                      ← Arize-side parent span
  └─ run_agent  [CHAIN]                   ← your top-level span
       ├─ openai.chat.completion          ← auto-instrumented LLM call
       ├─ search_flights  [TOOL]          ← your tool span
       │    └─ http_request               ← auto-instrumented HTTP call
       ├─ openai.chat.completion          ← next LLM turn
       ├─ search_hotels  [TOOL]
       └─ propose_itinerary  [TOOL]

All spans share one trace_id, and the experiment-run UI in Arize lets you jump directly into this trace.

Subprocess agents (Claude Agent SDK, OpenAI Agents SDK CLI mode, etc.)

If your agent runs the LLM calls in a subprocess (the Claude Agent SDK’s bundled CLI, for example), the auto-instrumentor in your parent Python process won’t see those API calls — they happen in another process. Two options:

Accept it. Your CHAIN + TOOL spans still capture the orchestration layer, which is usually what you care about for experiments. Per-turn LLM details just won’t be present.
Switch to in-process LLM calls. Replace the SDK with direct anthropic / openai SDK calls and your own loop. You lose the SDK’s harness, but full LLM tracing works.

If you go with the first option, you can still see what the agent did via your manual TOOL spans, and the run-level latency/result is captured at the CHAIN span level.

Where routing values come from

Arize sends routing information in two places, the request body and the baggage header, plus custom headers that identify the experiment and run. Use whichever fits your code:

Source	Where	Notes
Request body	`arize_metadata.space_id`, `arize_metadata.project_name`	Easiest — parse the JSON body.
Baggage header	`baggage: arize.space_id=...,arize.project_name=...`	Use OTel’s baggage API to read.
Custom headers	`x-arize-experiment-id`, `x-arize-experiment-run-id`	For experiment/run linkage, not space routing.

In code, check arize_metadata first, then fall back to baggage, then fall back to env defaults. The handler example above starts from arize_metadata, the first source in that order.
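
The priority order can be sketched as a small resolver that runs before set_routing_context(). This is a minimal sketch using only the standard library: parse_baggage and resolve_routing are hypothetical helper names, the env var names ARIZE_SPACE_ID and ARIZE_PROJECT_NAME are assumptions, and headers is assumed to be a mapping with lowercase keys.

```python
import os
from typing import Dict, Mapping, Optional, Tuple
from urllib.parse import unquote


def parse_baggage(header: Optional[str]) -> Dict[str, str]:
    """Parse a W3C baggage header into a dict, ignoring per-entry properties."""
    entries: Dict[str, str] = {}
    for member in (header or "").split(","):
        # Drop any optional ";property" suffix, then split key=value.
        key, sep, value = member.split(";", 1)[0].partition("=")
        if sep and key.strip():
            entries[key.strip()] = unquote(value.strip())
    return entries


def resolve_routing(
    body: Mapping,
    headers: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Resolve (space_id, project_name): body first, then baggage, then env."""
    md = body.get("arize_metadata") or {}
    baggage = parse_baggage(headers.get("baggage"))

    space_id = (
        md.get("space_id")
        or baggage.get("arize.space_id")
        or env.get("ARIZE_SPACE_ID")  # assumed env var name
    )
    project_name = (
        md.get("project_name")
        or baggage.get("arize.project_name")
        or env.get("ARIZE_PROJECT_NAME")  # assumed env var name
    )
    if not space_id or not project_name:
        # Fail loudly rather than pass None on to set_routing_context().
        raise ValueError("could not resolve space_id and project_name")
    return space_id, project_name
```

In the handler you would call space_id, project_name = resolve_routing(body, request.headers) and pass both values to set_routing_context(). Raising on a missing value surfaces the problem in your own logs instead of as a silently dropped span.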

Validating context propagation

After deploying your traced agent, run one experiment row and check both sides:

In the Experiments view, open the run and copy the trace_id from the experiment span.
In the Traces view, search for that trace_id. You should see both agent.experiment.run (from Arize) and run_agent (from your agent), with run_agent’s parent_id pointing at agent.experiment.run.
Tool spans should be children of run_agent. If they appear as direct children of agent.experiment.run instead, your CHAIN span isn’t getting exported. This is usually a routing context issue (see the warning at the end of Extracting the parent context).
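
If you want to script the same checks, the sketch below applies them to a list of span records. It assumes each span is a plain dict with name, trace_id, span_id, and parent_id keys. That shape is hypothetical, not an Arize export format, so adapt the field names to however you pull spans out of your environment.

```python
from typing import Iterable, List, Mapping, Optional


def _find(spans: List[Mapping], name: str) -> Optional[Mapping]:
    """Return the first span with the given name, or None."""
    return next((s for s in spans if s["name"] == name), None)


def check_trace_tree(
    spans: Iterable[Mapping],
    root_name: str = "agent.experiment.run",
    chain_name: str = "run_agent",
) -> List[str]:
    """Return a list of problems; an empty list means the tree looks right."""
    spans = list(spans)
    problems: List[str] = []

    if len({s["trace_id"] for s in spans}) > 1:
        problems.append("spans do not share one trace_id")

    root = _find(spans, root_name)
    chain = _find(spans, chain_name)
    if root is None:
        problems.append(f"{root_name} not found")
    if chain is None:
        problems.append(f"{chain_name} not found")
    if root is not None and chain is not None and chain["parent_id"] != root["span_id"]:
        problems.append(f"{chain_name} is not a child of {root_name}")

    # Anything other than the CHAIN span hanging directly off the root
    # suggests the CHAIN span was not exported.
    if root is not None:
        for s in spans:
            if s is not chain and s["parent_id"] == root["span_id"]:
                problems.append(f"{s['name']} is a direct child of {root_name}")
    return problems
```

An empty result corresponds to the healthy tree described above. A "direct child" message for a tool span matches the missing-CHAIN-span symptom.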

Common pitfalls

Spans appear with a different trace_id than the experiment run

You’re not extracting traceparent from the incoming request. Add the propagate.extract() step shown above.

CHAIN span shows up but tool spans don't (or vice versa)

You’re starting spans both inside and outside the set_routing_context() block. Move all span creation inside it.

The CHAIN span is missing, leaving tool spans 'orphaned'

You probably passed context=parent_ctx directly to start_as_current_span() and used set_routing_context(). The explicit context= override drops the routing attributes from the new span, and arize-otel filters it out. Fix: use attach() to make parent_ctx current first, then start the span normally.

`Failed to export traces, StatusCode.UNAVAILABLE`

Your ARIZE_OTLP_ENDPOINT is wrong or unreachable. Verify the URL includes scheme and path (e.g. https://otlp.arize.com/v1). For self-hosted Arize, point at your in-cluster collector.
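
A quick way to rule out a malformed value is to check the URL's shape at startup. The sketch below only inspects scheme, host, and path; it does not test reachability. check_otlp_endpoint is a hypothetical helper name.

```python
from typing import List
from urllib.parse import urlsplit


def check_otlp_endpoint(url: str) -> List[str]:
    """Return a list of shape problems with an OTLP endpoint URL."""
    problems: List[str] = []
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        problems.append("missing or unsupported scheme")
    if not parts.netloc:
        problems.append("missing host")
    if parts.path in ("", "/"):
        problems.append("missing path")
    return problems
```

For example, check_otlp_endpoint(os.environ["ARIZE_OTLP_ENDPOINT"]) returning a non-empty list is a reason to fix the env var before looking at network issues.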

Next

Running agent experiments

Pick a dataset, launch a run, and compare across config variants.
