POST /v2/datasets
Create a new dataset
curl --request POST \
  --url https://api.arize.com/v2/datasets \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "name": "Math Questions Dataset",
  "spaceId": "space_12345",
  "examples": [
    {
      "question": "What is 2 + 2?",
      "answer": "4",
      "topic": "arithmetic"
    },
    {
      "question": "What is the square root of 16?",
      "answer": "4",
      "topic": "geometry"
    },
    {
      "question": "If 3x = 12, what is x?",
      "answer": "4",
      "topic": "algebra"
    }
  ]
}
'
{
  "id": "<string>",
  "name": "<string>",
  "spaceId": "<string>",
  "createdAt": "2023-11-07T05:31:56Z",
  "updatedAt": "2023-11-07T05:31:56Z",
  "versions": [
    {
      "id": "<string>",
      "name": "<string>",
      "datasetId": "<string>",
      "createdAt": "2023-11-07T05:31:56Z",
      "updatedAt": "2023-11-07T05:31:56Z"
    }
  ]
}

Authorizations

Authorization
string
header
required

Most Arize AI endpoints require authentication. For those that do, include your API key in the Authorization request header, using the format:

Authorization: Bearer <api-key>
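
The header format above can be sketched in Python as follows. This is a minimal illustration, not part of any official client: the function name `build_auth_headers` is hypothetical, and rejecting blank keys is an assumption made here for safety rather than documented API behavior.

```python
from __future__ import annotations


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header in the 'Bearer <api-key>' format."""
    # Assumption: a blank key is never valid, so fail early on the client side.
    if not api_key or not api_key.strip():
        raise ValueError("api_key must be a non-empty string")
    return {"Authorization": f"Bearer {api_key.strip()}"}
```

The returned dictionary can be merged with other headers, such as `Content-Type: application/json`, before sending a request.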

Body

application/json

Body containing dataset creation parameters

name
string
required

spaceId
string
required

examples
object[]
required

Array of examples for the new dataset
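
As a rough sketch of how the three required body fields fit together, the snippet below builds (but does not send) the POST request using only the Python standard library. The function name `build_create_dataset_request` is hypothetical, and the client-side checks are assumptions: the API itself may apply different or additional validation.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.arize.com/v2/datasets"


def build_create_dataset_request(api_key, name, space_id, examples):
    """Build a POST request carrying the name, spaceId, and examples fields."""
    # Assumption: required string fields must be non-empty.
    if not isinstance(name, str) or not name:
        raise ValueError("name is required")
    if not isinstance(space_id, str) or not space_id:
        raise ValueError("spaceId is required")
    # The body schema lists examples as object[], so expect a list of dicts.
    if not isinstance(examples, list) or not all(isinstance(e, dict) for e in examples):
        raise ValueError("examples must be a list of JSON objects")

    payload = {"name": name, "spaceId": space_id, "examples": examples}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and decode the JSON response body. Building the request separately from sending it keeps the payload easy to inspect and test without network access.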

Response

A dataset object

A dataset is a structured collection of examples used to test and evaluate LLM applications. Datasets let you test models consistently across real-world scenarios and edge cases, quickly identify regressions, and track measurable improvements.

id
string
required

Unique identifier for the dataset

name
string
required

Name of the dataset

spaceId
string
required

Unique identifier for the space this dataset belongs to

createdAt
string<date-time>
required

Timestamp for when the dataset was created

updatedAt
string<date-time>
required

Timestamp for the last update of the dataset

versions
object[]

List of versions associated with this dataset
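
The response fields above could be mapped onto a typed object along these lines. This is a sketch under stated assumptions: the `Dataset` class and the `parse_dataset` and `parse_timestamp` helpers are hypothetical names, timestamps are assumed to follow the `2023-11-07T05:31:56Z` form shown in the sample response, and `versions` is treated as optional because it is the only field not marked required.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Dataset:
    id: str
    name: str
    space_id: str
    created_at: datetime
    updated_at: datetime
    versions: list[dict] = field(default_factory=list)


def parse_timestamp(value: str) -> datetime:
    """Parse a date-time string such as '2023-11-07T05:31:56Z'."""
    # datetime.fromisoformat rejects a trailing 'Z' before Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_dataset(payload: dict) -> Dataset:
    """Convert a decoded JSON response into a Dataset."""
    required = ("id", "name", "spaceId", "createdAt", "updatedAt")
    missing = [key for key in required if key not in payload]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    return Dataset(
        id=payload["id"],
        name=payload["name"],
        space_id=payload["spaceId"],
        created_at=parse_timestamp(payload["createdAt"]),
        updated_at=parse_timestamp(payload["updatedAt"]),
        versions=list(payload.get("versions", [])),
    )
```

Checking the required fields up front makes a malformed or error response fail with a clear message instead of a bare `KeyError` deeper in the calling code.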