The tasks client methods are currently in ALPHA. The API may change without notice. A one-time warning is emitted on first use.
Key Capabilities
- Create project-based tasks that run continuously against live spans
- Create dataset-based tasks that evaluate experiment results
- Trigger on-demand task runs with custom data windows
- Poll task runs until completion with configurable timeout
- Cancel in-progress runs
- List and filter task runs by status
List Tasks
List tasks you have access to, with optional filtering by space, project, dataset, or type.
Valid values for task_type are "template_evaluation" and "code_evaluation".
For details on pagination, field introspection, and data conversion (to dict/JSON/DataFrame), see Response Objects.
Create a Task
Create a new evaluation task. Tasks can target either a project (live spans) or a dataset (experiment results).
Project-Based Task
A project-based task continuously evaluates incoming spans. Set is_continuous=True to run the task on every new span, or False to run it only on demand.
Dataset-Based Task
A dataset-based task evaluates examples from one or more experiments. At least one experiment_ids entry is required.
Column Mappings and Filters
Each evaluator in the task can have its own column mappings (to map template variables to span attribute names) and a per-evaluator query filter.

| Parameter | Type | Description |
|---|---|---|
| name | str | Task name. Must be unique within the space. |
| task_type | str | "template_evaluation" or "code_evaluation". |
| evaluators | list | List of evaluators to attach. At least one required. |
| project | str | Target project name or ID. Required when dataset is not provided. |
| dataset | str | Target dataset name or ID. Required when project is not provided. |
| space | str | Space name or ID used to disambiguate name-based resolution for project and dataset. |
| experiment_ids | list[str] | Required (at least one) when dataset is provided. |
| sampling_rate | float | Fraction of spans to evaluate (0–1). Project-based tasks only. |
| is_continuous | bool | True to run on every new span; False for on-demand only. |
| query_filter | str | Task-level SQL-style filter applied to all evaluators. |
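The parameter rules in the table above can be checked locally before a create call is made. The following is a minimal sketch, not part of the tasks client: validate_task_args is a hypothetical helper, and it only encodes the constraints stated in the table (valid task_type values, at least one evaluator, a project or dataset target, experiment_ids for dataset targets, and a 0–1 sampling_rate for project targets).

```python
from typing import List, Optional, Sequence

# Values taken from the parameter table; the helper itself is hypothetical.
VALID_TASK_TYPES = {"template_evaluation", "code_evaluation"}


def validate_task_args(
    name: str,
    task_type: str,
    evaluators: Sequence[object],
    project: Optional[str] = None,
    dataset: Optional[str] = None,
    experiment_ids: Optional[Sequence[str]] = None,
    sampling_rate: Optional[float] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the arguments look consistent."""
    problems: List[str] = []
    if not name:
        problems.append("name is required")
    if task_type not in VALID_TASK_TYPES:
        problems.append(f"task_type must be one of {sorted(VALID_TASK_TYPES)}")
    if not evaluators:
        problems.append("at least one evaluator is required")
    if project is None and dataset is None:
        problems.append("either project or dataset is required")
    if dataset is not None and not experiment_ids:
        problems.append("experiment_ids needs at least one entry when dataset is provided")
    if sampling_rate is not None:
        if not 0 <= sampling_rate <= 1:
            problems.append("sampling_rate must be between 0 and 1")
        if project is None:
            problems.append("sampling_rate applies to project-based tasks only")
    return problems


# A project-based task with a valid sampling rate passes every check.
ok = validate_task_args("t", "template_evaluation", ["e"], project="p", sampling_rate=0.5)
# A dataset-based task without experiment_ids is flagged.
bad = validate_task_args("t", "template_evaluation", ["e"], dataset="d")
```

Returning a list of problems rather than raising on the first one lets a caller report every inconsistency at once. The server remains the authority on what is accepted.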
Get a Task
Retrieve a task by name or ID. When using a name, provide space to disambiguate.
Task Runs
Trigger a Run
Trigger an on-demand run for a task. The run starts in "pending" status.
| Parameter | Type | Default | Description |
|---|---|---|---|
| task | str | required | Task name or ID to trigger. |
| space | str | None | Space name or ID used to disambiguate the task lookup. Recommended when resolving by name. |
| data_start_time | datetime | None | Start of data window to evaluate. |
| data_end_time | datetime | now | End of data window. Defaults to the current time. |
| max_spans | int | 10 000 | Maximum number of spans to process. |
| override_evaluations | bool | False | Re-evaluate data that already has labels. |
| experiment_ids | list[str] | None | Experiment IDs to run against (dataset-based tasks only). |
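A custom data window is simply a pair of datetimes. The sketch below shows one way to assemble the arguments from the table for a "last N hours" window. build_run_kwargs is a hypothetical helper, not a client method, and the use of timezone-aware UTC datetimes is an assumption; check what the client expects before relying on it.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional


def build_run_kwargs(
    task: str,
    lookback: timedelta,
    now: Optional[datetime] = None,
    max_spans: int = 10_000,
    override_evaluations: bool = False,
) -> Dict[str, Any]:
    """Assemble keyword arguments for an on-demand run covering the last `lookback`."""
    if lookback <= timedelta(0):
        raise ValueError("lookback must be positive")
    # Assumption: timezone-aware UTC timestamps; `now` is injectable for testing.
    end = now or datetime.now(timezone.utc)
    return {
        "task": task,
        "data_start_time": end - lookback,
        "data_end_time": end,
        "max_spans": max_spans,
        "override_evaluations": override_evaluations,
    }


# Window covering the six hours before a fixed reference time.
reference = datetime(2024, 1, 2, tzinfo=timezone.utc)
kwargs = build_run_kwargs("my-task", timedelta(hours=6), now=reference)
```

The resulting dictionary uses the parameter names from the table, so it could be unpacked into whichever trigger method the client exposes.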
List Runs
List runs for a task with optional status filtering.
Valid status values: "pending", "running", "completed", "failed", "cancelled".
Get a Run
Retrieve a specific run by its ID.
Cancel a Run
Cancel a run that is currently "pending" or "running".
Wait for a Run
Poll a run until it reaches a terminal state ("completed", "failed", or "cancelled").
Raises TimeoutError if the run does not complete within timeout seconds.
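The polling behaviour described above can be illustrated with a generic loop. This is a sketch of the pattern, not the client's implementation: wait_for_status and its poll_interval parameter are hypothetical, and get_status stands in for whatever call retrieves the current run status.

```python
import time
from typing import Callable

# Terminal states as listed in this section.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_status(
    get_status: Callable[[], str],
    timeout: float = 60.0,
    poll_interval: float = 1.0,
) -> str:
    """Call get_status until it returns a terminal status or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status!r} after {timeout} seconds")
        time.sleep(poll_interval)


# Usage with a stand-in status source instead of a real API call.
statuses = iter(["pending", "running", "completed"])
final = wait_for_status(lambda: next(statuses), timeout=5.0, poll_interval=0.01)
```

The loop uses time.monotonic() so that system clock adjustments do not affect the deadline. A "failed" or "cancelled" result is returned rather than raised, which leaves it to the caller to decide how to react to each terminal state.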