Create a new experiment

curl --request POST \
  --url https://api.arize.com/v2/experiments \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "name": "My Experiment Name",
  "datasetId": "dataset_12345",
  "experimentRuns": [
    {
      "example_id": "example_001",
      "output": "4",
      "model": "gpt-4o-mini",
      "temperature": 0.2,
      "latency_ms": 118,
      "prompt": "Answer the math question briefly."
    },
    {
      "example_id": "example_002",
      "output": "4",
      "model": "gpt-4o-mini",
      "temperature": 0.2,
      "latency_ms": 132
    },
    {
      "example_id": "example_003",
      "output": "4",
      "model": "gpt-4o-mini",
      "temperature": 0.2,
      "latency_ms": 125
    }
  ]
}
'

{
  "id": "<string>",
  "name": "<string>",
  "datasetId": "<string>",
  "datasetVersionId": "<string>",
  "createdAt": "2023-11-07T05:31:56Z",
  "updatedAt": "2023-11-07T05:31:56Z",
  "experimentTracesProjectId": "<string>"
}

Create a new experiment from a list of JSON objects, called runs. Empty experiments are not allowed. Each run must include an example_id field that corresponds to an example in the dataset and an output field that contains the task’s output for that example, which serves as the input.

The name of the experiment must be unique within a given dataset.

Body containing experiment creation parameters.

Rules

- name must be unique within the target dataset.
- Provide at least one run in experimentRuns.
- Each run must include:
  - example_id: the ID of an existing example in the dataset/version
  - output: the model/task output for that example
- You may include any additional fields per run (e.g., model, latency_ms, temperature, prompt, tool_calls). These are stored and can be used for analysis and filtering.
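
The rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Arize API: the function name validate_experiment_payload is hypothetical, and it covers only the rules that can be checked locally. Name uniqueness and the existence of each example_id can only be confirmed by the server.

```python
from typing import Any


def validate_experiment_payload(payload: dict[str, Any]) -> list[str]:
    """Return a list of rule violations; an empty list means none were found.

    Hypothetical helper: it only checks what can be verified locally.
    """
    problems: list[str] = []

    # name and datasetId are required top-level strings.
    for field in ("name", "datasetId"):
        value = payload.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field} must be a non-empty string")

    # Empty experiments are not allowed.
    runs = payload.get("experimentRuns")
    if not isinstance(runs, list) or not runs:
        problems.append("experimentRuns must contain at least one run")
        return problems

    # Every run needs example_id and output; extra fields are allowed.
    for index, run in enumerate(runs):
        if not isinstance(run, dict):
            problems.append(f"run {index} must be a JSON object")
            continue
        for field in ("example_id", "output"):
            if field not in run:
                problems.append(f"run {index} is missing {field}")
    return problems
```

Returning every violation at once, rather than raising on the first one, makes it easier to fix a large batch of runs in a single pass.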

⚠️ Beta Warning: This endpoint is in beta.

POST /v2/experiments

Authorizations

Authorization (string, header, required)

Most Arize AI endpoints require authentication. For endpoints that do, include your API key in the request header using the format:

Authorization: Bearer <api-key>
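
As an illustration of the header format, here is a minimal sketch that builds the same request as the curl example using only the Python standard library. The function name build_create_experiment_request is hypothetical, and the URL is copied from the curl example on this page. The function constructs the request without sending it.

```python
import json
import urllib.request

# URL taken from the curl example on this page.
EXPERIMENTS_URL = "https://api.arize.com/v2/experiments"


def build_create_experiment_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates an experiment."""
    return urllib.request.Request(
        EXPERIMENTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Bearer format as described in the Authorizations section.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen and decode the JSON body of the response. Reading the API key from an environment variable rather than hard-coding it keeps it out of source control.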

Body

application/json

Body containing experiment creation parameters

name (string, required)
Name of the experiment

datasetId (string, required)
ID of the dataset to create the experiment for

experimentRuns (object[], required)
Array of experiment run data

experimentRuns.example_id (string, required)
ID of the dataset example associated with this experiment run

experimentRuns.output (string, required)
Output of the task for the matching example

experimentRuns.{key}
Additional user-defined fields in the experiment run
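
The body fields can be assembled programmatically from task results. The sketch below assumes the example_id key used in the request example on this page; the helper names make_run and make_experiment_body are hypothetical and not part of any Arize client.

```python
from typing import Any, Iterable


def make_run(example_id: str, output: str, **extra_fields: Any) -> dict[str, Any]:
    """Build one run: the two required fields plus any user-defined extras."""
    return {"example_id": example_id, "output": output, **extra_fields}


def make_experiment_body(
    name: str, dataset_id: str, runs: Iterable[dict[str, Any]]
) -> dict[str, Any]:
    """Assemble the request body with the three required top-level fields."""
    return {"name": name, "datasetId": dataset_id, "experimentRuns": list(runs)}


# Rebuild the first two runs of the request example on this page.
body = make_experiment_body(
    "My Experiment Name",
    "dataset_12345",
    [
        make_run("example_001", "4", model="gpt-4o-mini", temperature=0.2, latency_ms=118),
        make_run("example_002", "4", model="gpt-4o-mini", temperature=0.2, latency_ms=132),
    ],
)
```

Passing extras as keyword arguments keeps the required fields explicit while leaving the set of user-defined fields open.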

Response

An experiment object

Experiments combine a dataset (example inputs/expected outputs), a task (the function that produces model outputs), and one or more evaluators (code or LLM judges) to measure performance. Each run is stored independently so you can compare runs, track progress, and validate improvements over time. See the full definition on the Experiments page.

Use an experiment to run tasks on a dataset, attach evaluators to score outputs, and compare runs to confirm improvements.

id (string, required)
Unique identifier for the experiment

name (string, required)
Name of the experiment

datasetId (string, required)
Unique identifier for the dataset this experiment belongs to

datasetVersionId (string, required)
Unique identifier for the dataset version this experiment belongs to

createdAt (string<date-time>, required)
Timestamp for when the experiment was created

updatedAt (string<date-time>, required)
Timestamp for the last update of the experiment

experimentTracesProjectId (string)
Unique identifier for the experiment traces project this experiment belongs to, if it exists
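
A client may want to turn the response into a typed object. The following is a minimal sketch under stated assumptions: the Experiment dataclass and its snake_case attribute names are hypothetical, while the JSON keys are the ones listed above. The optional experimentTracesProjectId becomes None when it is absent.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass(frozen=True)
class Experiment:
    """Hypothetical client-side view of the experiment object."""

    id: str
    name: str
    dataset_id: str
    dataset_version_id: str
    created_at: datetime
    updated_at: datetime
    experiment_traces_project_id: Optional[str] = None


def _parse_timestamp(value: str) -> datetime:
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_experiment(data: dict[str, Any]) -> Experiment:
    """Convert the decoded JSON response into an Experiment."""
    return Experiment(
        id=data["id"],
        name=data["name"],
        dataset_id=data["datasetId"],
        dataset_version_id=data["datasetVersionId"],
        created_at=_parse_timestamp(data["createdAt"]),
        updated_at=_parse_timestamp(data["updatedAt"]),
        # Not marked as required, so it may be missing.
        experiment_traces_project_id=data.get("experimentTracesProjectId"),
    )
```

Required keys are read with indexing so that a malformed response fails loudly with a KeyError instead of producing a partially filled object.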
