# Setting up tracing for agent experiments

> Link the spans your agent emits to the experiment run that called it, using W3C trace context propagation and per-request space routing.

When you run an agent experiment, Arize creates a top-level span for each experiment run and sends a W3C `traceparent` header with the request. If your agent extracts that header and uses it as the parent context for its own spans, all the agent's tracing — LLM calls, tool calls, retrieval steps — nests under the experiment-run trace in the Arize UI.

This page covers what your agent needs to do.

## What you need to do

Two things, in order:

1. **Extract `traceparent` from the incoming request** and use it as the parent context for the top-level span you create.
2. **Use `register_with_routing()` and `set_routing_context()`** so spans land in the right space and project per-request, instead of the static one configured at app startup.

If you skip step 1, your agent's spans appear in a separate, orphan trace. If you skip step 2, every request's spans go to whatever space your env vars point at — even when Arize calls from a different space.

## How Arize propagates context

Every agent-experiment request includes these headers:

```
traceparent: 00-<32-char-trace-id>-<16-char-span-id>-01
tracestate: <vendor-state>
baggage: arize.space_id=<space>,arize.project_name=<project>
```

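* **`traceparent`** is the W3C Trace Context header. It links your spans to a specific Arize-side parent span.
* **`tracestate`** is the companion W3C header for vendor-specific state. `propagate.extract()` reads it together with `traceparent`, so you don't need to handle it yourself.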
* **`baggage`** carries the space\_id and project\_name Arize wants your spans routed to. Both fields are also available in the request body as `arize_metadata.space_id` and `arize_metadata.project_name`.
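
When debugging, it can help to log the trace ID Arize sent before handing the headers to OpenTelemetry. The following is a minimal sketch using only the standard library; `parse_traceparent` and `TraceParent` are hypothetical names, not part of any Arize or OpenTelemetry package, and the parser only accepts the four-field format shown above. In the handler itself, `propagate.extract()` still does the real work.

```python
import re
from typing import NamedTuple, Optional

# version-traceid-spanid-flags, all lowercase hex, as in the header shown above.
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    """Hypothetical container for the parsed header fields."""
    version: str
    trace_id: str
    parent_span_id: str
    sampled: bool


def parse_traceparent(header: Optional[str]) -> Optional[TraceParent]:
    """Return the parsed fields, or None if the header is missing or malformed."""
    if not header:
        return None
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    # The spec forbids version "ff" and all-zero trace or span IDs.
    if version == "ff" or set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    # Bit 0 of the flags byte is the "sampled" flag.
    return TraceParent(version, trace_id, span_id, bool(int(flags, 16) & 0x01))
```

Logging `parse_traceparent(request.headers.get("traceparent"))` at the top of the handler gives you a trace ID to search for in Arize, and a `None` result tells you the header never arrived.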

## Extracting the parent context

OpenTelemetry's `propagate.extract()` handles this in one line. Pass it the incoming request headers and it returns a `Context`, which you attach before creating your top-level span.

<CodeGroup>
  ```python FastAPI theme={null}
  from fastapi import FastAPI, Request
  from opentelemetry.context import attach, detach
  from opentelemetry.propagate import extract
  from opentelemetry.trace import get_tracer

  tracer = get_tracer("my-agent", "1.0.0")
  app = FastAPI()

  @app.post("/invoke")
  async def invoke(request: Request, body: dict):
      # 1. Extract W3C trace context from the incoming headers.
      parent_ctx = extract(dict(request.headers))

      # 2. Attach it as the current OTel context.
      token = attach(parent_ctx)
      try:
          # 3. Start your CHAIN span. Because parent_ctx is now current,
          #    this span automatically becomes a child of the Arize-side
          #    experiment-run trace.
          with tracer.start_as_current_span("run_agent") as chain_span:
              chain_span.set_attribute("openinference.span.kind", "CHAIN")
              chain_span.set_attribute("input.value", body.get("input", {}).get("goal", ""))

              # ... your agent code, LLM calls, tool spans ...
              result = await my_agent_logic(body)

              chain_span.set_attribute("output.value", result.text)
              return result
      finally:
          detach(token)
  ```

  ```typescript Express theme={null}
  import express from "express";
  import { context, propagation, trace } from "@opentelemetry/api";

  const tracer = trace.getTracer("my-agent", "1.0.0");
  const app = express();
  app.use(express.json());

  app.post("/invoke", async (req, res) => {
    // 1. Extract W3C trace context from the incoming headers.
    const parentCtx = propagation.extract(context.active(), req.headers);

    // 2. Run the rest inside that context.
    await context.with(parentCtx, () =>
      // startActiveSpan makes run_agent the current span, so spans created
      // by your agent code nest under it instead of under the Arize parent.
      tracer.startActiveSpan("run_agent", async (span) => {
        span.setAttribute("openinference.span.kind", "CHAIN");
        span.setAttribute("input.value", req.body.input?.goal ?? "");

        try {
          // ... your agent code, LLM calls, tool spans ...
          const result = await myAgentLogic(req.body);

          span.setAttribute("output.value", result.text);
          res.json(result);
        } finally {
          span.end();
        }
      })
    );
  });

  app.listen(8000);
  ```
</CodeGroup>

<Callout type="warning">
  **Don't pass `context=parent_ctx` directly to `start_as_current_span()`** if you're also using `set_routing_context()` (next section). Doing so overrides the routing context, and `arize-otel` drops the span. Always `attach()` first, layer routing on top, then start the span with no explicit `context` override.
</Callout>

## Per-request space and project routing

The standard `arize.otel.register()` call locks your TracerProvider to a single space and project at app startup. That doesn't work for an agent endpoint that experiment runners in different spaces can call: one space's runner might call it now, and another space's a minute later.

Use `register_with_routing()` instead — it leaves routing unset at startup, and you decide per-request via `set_routing_context()`:

```python theme={null}
import os
from arize.otel import register_with_routing, set_routing_context
from openinference.instrumentation.anthropic import AnthropicInstrumentor

# At app startup — no fixed space_id / project_name.
tracer_provider = register_with_routing(api_key=os.environ["ARIZE_API_KEY"])
AnthropicInstrumentor().instrument(tracer_provider=tracer_provider)
```

Then in your `/invoke` handler, resolve the space and project from the request, and wrap the span creation:

```python theme={null}
async def invoke(request, body):
    parent_ctx = extract(dict(request.headers))

    # Pull routing info from arize_metadata. Add fallbacks (baggage header,
    # then env defaults) so that neither value is ever None.
    md = body.get("arize_metadata", {})
    space_id = md.get("space_id")
    project_name = md.get("project_name", "my-agent")

    token = attach(parent_ctx)
    try:
        # Layer routing on top of the parent context.
        with set_routing_context(space_id=space_id, project_name=project_name):
            with tracer.start_as_current_span("run_agent") as chain_span:
                chain_span.set_attribute("openinference.span.kind", "CHAIN")
                # ... rest of your agent ...
    finally:
        detach(token)
```

<Callout type="info">
  Both `space_id` and `project_name` must be set inside `set_routing_context()`, or `arize-otel` drops the span entirely. If either is missing in `arize_metadata`, fall back to env defaults or use the request's baggage header — but never call `set_routing_context()` with `None`.
</Callout>

## What the final trace looks like

When both pieces are in place, the trace tree in Arize looks like:

```
agent.experiment.run                      ← Arize-side parent span
  └─ run_agent  [CHAIN]                   ← your top-level span
       ├─ openai.chat.completion          ← auto-instrumented LLM call
       ├─ search_flights  [TOOL]          ← your tool span
       │    └─ http_request               ← auto-instrumented HTTP call
       ├─ openai.chat.completion          ← next LLM turn
       ├─ search_hotels  [TOOL]
       └─ propose_itinerary  [TOOL]
```

All spans share one trace\_id, and the experiment-run UI in Arize lets you jump directly into this trace.

## Subprocess agents (Claude Agent SDK, OpenAI Agents SDK CLI mode, etc.)

If your agent runs the LLM calls in a **subprocess** (the Claude Agent SDK's bundled CLI, for example), the auto-instrumentor in your parent Python process won't see those API calls — they happen in another process.

Two options:

* **Accept it.** Your CHAIN + TOOL spans still capture the orchestration layer, which is usually what you care about for experiments. Per-turn LLM details just won't be present.
* **Switch to in-process LLM calls.** Replace the SDK with direct `anthropic` / `openai` SDK calls and your own loop. You lose the SDK's harness, but full LLM tracing works.

If you go with the first option, you can still see *what* the agent did via your manual TOOL spans, and the run-level latency/result is captured at the CHAIN span level.

## Where routing values come from

Arize sends routing information in two places, the request body and the baggage header, and sends experiment linkage IDs in custom headers. Use whichever routing source fits your code:

| Source         | Where                                                    | Notes                                          |
| -------------- | -------------------------------------------------------- | ---------------------------------------------- |
| Request body   | `arize_metadata.space_id`, `arize_metadata.project_name` | Easiest — parse the JSON body.                 |
| Baggage header | `baggage: arize.space_id=...,arize.project_name=...`     | Use OTel's baggage API to read.                |
| Custom headers | `x-arize-experiment-id`, `x-arize-experiment-run-id`     | For experiment/run linkage, not space routing. |

In code, check `arize_metadata` first, then fall back to baggage, then fall back to env defaults. The `/invoke` example above reads only `arize_metadata`, so add the remaining fallbacks in your own handler.
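
The priority order described here can be sketched as a small resolver. This is an illustration under stated assumptions, not part of `arize-otel`: `parse_baggage` and `resolve_routing` are hypothetical helpers, the baggage header is parsed by hand rather than through OTel's baggage API, header keys are assumed to be lowercase, and the env var names `ARIZE_SPACE_ID` and `ARIZE_PROJECT_NAME` are assumptions you should replace with whatever your deployment uses.

```python
import os
from typing import Mapping, Optional, Tuple
from urllib.parse import unquote


def parse_baggage(header: Optional[str]) -> dict:
    """Parse a W3C baggage header into a dict, ignoring per-member properties."""
    members = {}
    for member in (header or "").split(","):
        key_value = member.split(";", 1)[0]  # drop ";property" suffixes
        key, sep, value = key_value.partition("=")
        if sep and key.strip():
            # Baggage values are percent-encoded on the wire.
            members[key.strip()] = unquote(value.strip())
    return members


def resolve_routing(
    body: Mapping,
    headers: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Return (space_id, project_name): request body, then baggage, then env."""
    md = body.get("arize_metadata") or {}
    baggage = parse_baggage(headers.get("baggage"))
    space_id = (
        md.get("space_id")
        or baggage.get("arize.space_id")
        or env.get("ARIZE_SPACE_ID")  # assumed env var name
    )
    project_name = (
        md.get("project_name")
        or baggage.get("arize.project_name")
        or env.get("ARIZE_PROJECT_NAME")  # assumed env var name
    )
    if not space_id or not project_name:
        # Fail loudly instead of passing None to set_routing_context().
        raise ValueError("could not resolve both space_id and project_name")
    return space_id, project_name
```

In a handler you would call `space_id, project_name = resolve_routing(body, dict(request.headers))` and pass both values to `set_routing_context()`. Raising on a missing value is one design choice; returning a fixed default project is another, as long as `None` never reaches the routing call.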

## Validating context propagation

After deploying your traced agent, run one experiment row and check both sides:

1. In the **Experiments** view, open the run and copy the `trace_id` from the experiment span.
2. In the **Traces** view, search for that `trace_id`. You should see both `agent.experiment.run` (from Arize) and `run_agent` (from your agent), with `run_agent`'s parent\_id pointing at `agent.experiment.run`.
3. Tool spans should be children of `run_agent`. If they appear as direct children of `agent.experiment.run` instead, your CHAIN span isn't getting exported — usually a routing context issue (see warning above).
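
If you export spans for inspection, the same three checks can be written down as code. The sketch below is hypothetical throughout: `check_trace_tree` is not an Arize API, and the span dictionaries (keys `name`, `trace_id`, `span_id`, `parent_id`, `kind`) are an assumed shape you would build yourself from whatever export format you use.

```python
from typing import Dict, List, Optional


def check_trace_tree(spans: List[Dict[str, Optional[str]]]) -> List[str]:
    """Return a list of problems found; an empty list means the tree looks right."""
    by_name = {s["name"]: s for s in spans}
    root = by_name.get("agent.experiment.run")
    chain = by_name.get("run_agent")
    if root is None:
        return ["agent.experiment.run not found"]
    if chain is None:
        return ["run_agent not found (CHAIN span not exported?)"]

    problems = []
    if chain["trace_id"] != root["trace_id"]:
        problems.append("run_agent has a different trace_id (traceparent not extracted?)")
    if chain["parent_id"] != root["span_id"]:
        problems.append("run_agent is not a child of agent.experiment.run")
    for span in spans:
        # Tool spans belong under run_agent, never directly under the Arize span.
        if span.get("kind") == "TOOL" and span["parent_id"] == root["span_id"]:
            problems.append(f"tool span {span['name']} is parented to the experiment span")
    return problems
```

An empty result corresponds to the healthy tree described in steps 2 and 3; each message maps to one of the pitfalls listed in the next section.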

## Common pitfalls

<AccordionGroup>
  <Accordion title="Spans appear with a different trace_id than the experiment run">
    You're not extracting `traceparent` from the incoming request. Add the `propagate.extract()` step shown above.
  </Accordion>

  <Accordion title="CHAIN span shows up but tool spans don't (or vice versa)">
    You're starting spans both inside and outside the `set_routing_context()` block. Move all span creation inside it.
  </Accordion>

  <Accordion title="The CHAIN span is missing, leaving tool spans 'orphaned'">
    You probably passed `context=parent_ctx` directly to `start_as_current_span()` *and* used `set_routing_context()`. The explicit `context=` override drops the routing attributes from the new span, and arize-otel filters it out. Fix: use `attach()` to make `parent_ctx` current first, then start the span normally.
  </Accordion>

  <Accordion title="`Failed to export traces, StatusCode.UNAVAILABLE`">
    Your `ARIZE_OTLP_ENDPOINT` is wrong or unreachable. Verify the URL includes scheme and path (e.g. `https://otlp.arize.com/v1`). For self-hosted Arize, point at your in-cluster collector.
  </Accordion>
</AccordionGroup>

## Next

<Card title="Running agent experiments" href="/ax/improve/run-agent-experiments">
  Pick a dataset, launch a run, and compare across config variants.
</Card>
