Agent experiments call a POST endpoint that you host. Your agent can be built on any framework — Arize doesn’t see it directly, only the requests and responses. This page covers what your endpoint needs to do and how to register it in Arize.

Endpoint requirements

Your agent must expose an HTTP endpoint that:
  1. Accepts POST requests with Content-Type: application/json.
  2. Reads the request body as JSON.
  3. Returns a JSON response body (any shape — Arize stores it verbatim).
  4. Is reachable from Arize’s coordinator over the public internet (or a VPC peering setup, for self-hosted deployments).
Arize places no requirements on the response shape. For status codes, return 200 on success; failures are recorded with their error. Response time only needs to stay under the configured timeout.

Request body shape

Arize wraps the body you template into an envelope:
{
  "input": {
    "goal": "Plan a 3-day trip to Tokyo",
    "config": { "model": "claude-sonnet-4-5", "max_turns": 12 }
  },
  "arize_metadata": {
    "dataset_id": "abc...",
    "experiment_id": "exp...",
    "run_id": "run...",
    "example_id": "ex...",
    "space_id": "sp..."
  }
}
  • input — the body you templated in the agent configuration, hydrated with the current dataset row.
  • arize_metadata — appended by Arize on every request. Use these IDs to stamp spans your agent produces, so traces link back to this run. See Setting up tracing.
Your agent reads from input and ignores arize_metadata (or uses it for trace correlation).
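As a rough illustration of the envelope handling described above, the sketch below separates input from arize_metadata using only the standard library. The helper name split_envelope and the run ID value are hypothetical and not part of any Arize SDK; the field names come from the example envelope.

```python
import json


def split_envelope(raw_body: str) -> tuple[dict, dict]:
    """Split an Arize-style envelope into (input, arize_metadata).

    split_envelope is a hypothetical helper, not an Arize API.
    Missing or null keys fall back to empty dicts.
    """
    envelope = json.loads(raw_body)
    if not isinstance(envelope, dict):
        raise ValueError("request body must be a JSON object")
    agent_input = envelope.get("input") or {}
    metadata = envelope.get("arize_metadata") or {}
    return agent_input, metadata


# Example envelope; the run_id value is made up for illustration.
raw = json.dumps({
    "input": {"goal": "Plan a 3-day trip to Tokyo", "config": {"max_turns": 12}},
    "arize_metadata": {"run_id": "run-123"},
})
agent_input, metadata = split_envelope(raw)
print(agent_input["goal"])
print(metadata["run_id"])
```

Falling back to empty dicts keeps the handler tolerant of requests that omit either key, which matches the optional fields in the FastAPI model later on this page.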

Headers Arize sends

Header                           Purpose
Content-Type: application/json   Body encoding.
Authorization (optional)         Bearer token or custom auth headers you configured.
traceparent                      W3C trace context, links your agent’s spans to the experiment run.
tracestate                       W3C trace state.
baggage (optional)               Routing baggage for OpenTelemetry context propagation.
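To make the traceparent header more concrete, here is a minimal sketch of parsing it by hand, assuming the four-field version 00 layout from the W3C Trace Context specification (version, trace ID, parent ID, flags). The names TraceParent and parse_traceparent are hypothetical; in a real service an OpenTelemetry propagator would normally do this for you.

```python
import re
from typing import NamedTuple, Optional

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT_RE = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(value: str) -> Optional[TraceParent]:
    """Parse a four-field traceparent header; return None if it is invalid."""
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # The spec forbids version "ff" and all-zero trace or parent IDs.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    # The lowest flag bit marks the trace as sampled.
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


tp = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
print(tp)
```

The trace_id extracted here is the value your agent's spans would need to share so that they join the experiment run's trace.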

A minimal Python agent endpoint

Here’s a FastAPI example that accepts the Arize envelope, runs your agent code, and returns a response. run_my_agent is a placeholder for your own agent entry point and is not defined here:
from fastapi import FastAPI, HTTPException, Request
from pydantic import BaseModel, ConfigDict

app = FastAPI()

class InvokeRequest(BaseModel):
    model_config = ConfigDict(extra="allow")

    input: dict | None = None
    arize_metadata: dict | None = None

@app.post("/invoke")
async def invoke(req: InvokeRequest, request: Request):
    payload = req.input or {}
    goal = payload.get("goal")
    if not goal:
        raise HTTPException(status_code=400, detail="missing input.goal")

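    # "or {}" also covers an explicit null config, which would break **config below.
    config = payload.get("config") or {}
    # IDs for trace correlation; not used further in this minimal example.
    md = req.arize_metadata or {}

    # Run your agent here. run_my_agent is a placeholder for your own code.
    result = await run_my_agent(goal=goal, **config)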

    return {
        "final_response": result.text,
        "tool_calls": result.tool_calls,
        "trace_id": result.trace_id,
    }
Pydantic v2 ignores unknown fields by default, so arize_metadata passes validation even if you don’t declare it. The example declares it explicitly so you can read from it later for tracing, and sets extra="allow" so any other envelope fields are kept on the model rather than dropped.

Authentication

We recommend one of:
  • Bearer token (Authorization: Bearer <your-key>). Simple, works with any HTTP client.
  • API key header — a custom header like X-API-Key: <your-key>.
  • Custom headers — multiple headers if your endpoint requires them.
Set these in the agent configuration’s Headers section; Arize stores them encrypted and replays them on every request. For internal-only endpoints (not exposed to the public internet), agent experiments are not currently supported for cloud-hosted Arize. Self-hosted deployments can use VPC peering.
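On the endpoint side, the bearer token that Arize replays still has to be checked. The following is a minimal sketch of that check, assuming a single shared key; is_authorized and the sample key are hypothetical names for illustration, and hmac.compare_digest is used so the comparison takes constant time.

```python
import hmac
from typing import Mapping


def is_authorized(headers: Mapping[str, str], expected_key: str) -> bool:
    """Return True if the Authorization header carries the expected bearer token."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    auth = normalized.get("authorization", "")
    scheme, _, token = auth.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.strip().encode(), expected_key.encode())


# Hypothetical key for illustration; load the real one from your secret store.
print(is_authorized({"Authorization": "Bearer s3cret"}, "s3cret"))
```

In the FastAPI example above, a check like this could run at the top of the handler and raise HTTPException(status_code=401) when it returns False.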

Register the agent in Arize

1. Open Space Settings → Agents

Click New Agent in the top right.
2. Name and describe

Give the agent a name (e.g. customer-support-v2) and a one-line description of what it does. This is what teammates will see in the agent picker.
3. Set endpoint URL

Paste the full URL, e.g. https://my-agent.example.com/invoke. Arize will not append paths, so use the exact URL.
4. Add auth headers

Click Add Header, set Authorization (or your custom header), and paste the value. Add as many as you need.
5. Define input schema

The Input Schema is a JSON Schema that describes the body Arize templates inside the input key. This is what unlocks per-experiment config validation.
A minimal schema for the FastAPI example above:
{
  "type": "object",
  "properties": {
    "goal": {
      "type": "string",
      "description": "What the agent should do"
    },
    "config": {
      "type": "object",
      "properties": {
        "model": { "type": "string" },
        "max_turns": { "type": "integer" }
      },
      "additionalProperties": false
    }
  },
  "required": ["goal"],
  "additionalProperties": false
}
6. (Optional) Add request presets

A preset is a named config payload your team can pick from when running an experiment, instead of writing JSON by hand. For example:
  • Production baseline: { "model": "claude-sonnet-4-5", "max_turns": 12 }
  • Opus comparison: { "model": "claude-opus-4-7", "max_turns": 15 }
  • Cost optimized: { "model": "claude-haiku-4-5", "max_turns": 8 }
Presets are what make agent experiments demo-able to PMs and non-engineers.
7. Save

Click Create Agent. The agent now appears in the Run against agent picker on every dataset.
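Before registering, it can help to check locally that a payload satisfies the constraints in the input schema from the step on defining it. The sketch below is a hand-written mirror of that one schema using only the standard library, not a general JSON Schema validator and not how Arize validates; validate_input and its error strings are hypothetical.

```python
# Property names and types copied from the example input schema.
ALLOWED_TOP_LEVEL = {"goal", "config"}
CONFIG_TYPES = {"model": str, "max_turns": int}


def validate_input(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload passes."""
    errors = []
    if not isinstance(payload.get("goal"), str):
        errors.append("goal is required and must be a string")
    # additionalProperties: false at the top level
    for key in sorted(payload.keys() - ALLOWED_TOP_LEVEL):
        errors.append(f"unexpected property: {key}")
    config = payload.get("config", {})
    if not isinstance(config, dict):
        errors.append("config must be an object")
        return errors
    for key, value in config.items():
        expected = CONFIG_TYPES.get(key)
        if expected is None:
            # additionalProperties: false inside config
            errors.append(f"unexpected config property: {key}")
        # bool is a subclass of int in Python, but a JSON boolean is not an integer.
        elif not isinstance(value, expected) or isinstance(value, bool):
            errors.append(f"config.{key} must be of type {expected.__name__}")
    return errors


print(validate_input({"goal": "Plan a trip", "config": {"model": "claude-sonnet-4-5", "max_turns": 12}}))
```

Because the schema sets additionalProperties to false, a preset that introduces a new config key would also need that key added to the schema.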

Hydrating the body from dataset columns

When you run an experiment, your request body template uses {{column_name}} placeholders that get replaced with values from each dataset row:
{
  "goal": "{{input}}",
  "config": { "model": "claude-sonnet-4-5" }
}
If your dataset has a column called input, {{input}} is replaced with that row’s value. If the column is named differently — e.g. question, user_prompt — use {{question}} instead. Placeholder names must match column names exactly.
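The placeholder substitution described above can be sketched roughly as follows. This is an illustration of the idea only, not Arize's implementation: the hydrate function is hypothetical, it converts every value to a string, and raising on an unknown column is an assumption about how a mismatch might be surfaced.

```python
import re
from typing import Any

# Matches {{column_name}} with no whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def hydrate(template: Any, row: dict[str, Any]) -> Any:
    """Recursively replace {{column}} placeholders in a JSON-like template."""
    if isinstance(template, dict):
        return {key: hydrate(value, row) for key, value in template.items()}
    if isinstance(template, list):
        return [hydrate(item, row) for item in template]
    if isinstance(template, str):
        def substitute(match: re.Match) -> str:
            column = match.group(1)
            if column not in row:
                # Assumption: treat an unmatched placeholder as an error.
                raise KeyError(f"no dataset column named {column!r}")
            return str(row[column])
        return _PLACEHOLDER.sub(substitute, template)
    # Numbers, booleans, and null pass through unchanged.
    return template


template = {"goal": "{{input}}", "config": {"model": "claude-sonnet-4-5"}}
row = {"input": "Plan a 3-day trip to Tokyo"}
body = hydrate(template, row)
print(body)
```

Running the same template against a row whose column is named question instead of input raises KeyError in this sketch, which mirrors the exact-match rule for placeholder names.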

Common issues

  • Body shape mismatch. This is almost always the cause when your endpoint cannot find its input. Confirm your endpoint reads from input.goal (or whatever you named it), not goal at the top level, because Arize wraps the templated body in input. The full envelope is { "input": {...}, "arize_metadata": {...} }.
  • Placeholder not replaced. The dataset column name doesn’t match the placeholder. Check the column header in your dataset: if it’s prompt, use {{prompt}}. Use {{column_name}}, not {{dataset.column_name}}.
  • Timeouts. The default request timeout is 120 seconds. For agents that legitimately take longer, raise the timeout in the agent configuration. For agents that run minutes-long workflows, consider an async pattern: return immediately with a job ID, and have the agent push the final result via the Arize API.
  • Rate limits. Arize runs dataset rows in parallel. If your agent’s downstream API has a rate limit, lower the experiment’s concurrency setting when you launch it.
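As a complement to lowering the experiment's concurrency, the agent itself can cap how many downstream calls are in flight at once. The sketch below uses asyncio.Semaphore for that; it is an agent-side pattern rather than an Arize feature, it limits concurrency rather than requests per second, and MAX_IN_FLIGHT and call_downstream are hypothetical names.

```python
import asyncio

MAX_IN_FLIGHT = 2  # hypothetical cap; choose it from your downstream rate limit


async def call_downstream(i: int, semaphore: asyncio.Semaphore, stats: dict) -> int:
    # Only MAX_IN_FLIGHT coroutines can be inside this block at once.
    async with semaphore:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        await asyncio.sleep(0.01)  # stands in for the real downstream API call
        stats["in_flight"] -= 1
        return i


async def main() -> tuple[list[int], int]:
    semaphore = asyncio.Semaphore(MAX_IN_FLIGHT)
    stats = {"in_flight": 0, "peak": 0}
    # Six parallel requests arrive, as they would from parallel dataset rows.
    results = await asyncio.gather(
        *(call_downstream(i, semaphore, stats) for i in range(6))
    )
    return list(results), stats["peak"]


results, peak = asyncio.run(main())
print(results, peak)
```

In a FastAPI service the semaphore would typically be created once at startup and shared across request handlers, so that the cap applies to the whole process.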

Next: connect tracing

Your endpoint accepts requests and returns responses — but for the full picture (every LLM call, tool invocation, latency, token use) to land in Arize alongside the experiment, set up tracing next.

Setting up tracing for agent experiments

How traceparent propagation links your agent’s spans to the experiment run, including dynamic per-request space routing.