The ax evaluators commands are currently in ALPHA. The API may change without notice. A one-time warning is emitted on first use.
The ax evaluators commands let you create and manage LLM-as-judge evaluators and their versions on the Arize platform.
ax evaluators list
List evaluators, optionally filtered by space or by name.
ax evaluators list [--space <id>] [--name <filter>] [--limit <n>] [--cursor <cursor>]
| Option | Description |
|---|---|
| --space | Filter evaluators by space name or ID |
| --name | Case-insensitive substring filter on evaluator name |
| --limit | Maximum number of results to return (default: 15) |
| --cursor | Pagination cursor for the next page |
Examples:
ax evaluators list --space sp_abc123
ax evaluators list --space sp_abc123 --output evaluators.json
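When the list command is driven from a script, it can help to assemble the argument list programmatically so that only the filters actually set are passed. The following is a minimal sketch that builds the argument list without running it. The helper name `build_list_command` is hypothetical, the positive-limit check is an assumption rather than documented CLI behavior, and how the cursor for the next page is obtained from the command's output is not described here.

```python
import shlex
from typing import List, Optional


def build_list_command(
    space: Optional[str] = None,
    name: Optional[str] = None,
    limit: Optional[int] = None,
    cursor: Optional[str] = None,
) -> List[str]:
    """Assemble argv for `ax evaluators list`, adding only the options that are set."""
    argv = ["ax", "evaluators", "list"]
    if space is not None:
        argv += ["--space", space]
    if name is not None:
        argv += ["--name", name]
    if limit is not None:
        # Assumption: a limit below 1 is never meaningful, so reject it locally.
        if limit < 1:
            raise ValueError("limit must be a positive integer")
        argv += ["--limit", str(limit)]
    if cursor is not None:
        argv += ["--cursor", cursor]
    return argv


# Build (but do not run) a filtered first-page request.
first_page = build_list_command(space="sp_abc123", name="relev", limit=50)
print(shlex.join(first_page))
```

The resulting list can be handed to `subprocess.run` as-is, which avoids shell quoting issues in the filter values.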
ax evaluators create-template-evaluator
Create a new template (LLM-as-judge) evaluator with an initial version. The CLI prompts interactively for any required option that is not passed as a flag.
ax evaluators create-template-evaluator \
--name <name> \
--space <id> \
--commit-message <message> \
--template-name <name> \
--template <template-string> \
--ai-integration-id <id> \
--model-name <model>
| Option | Description |
|---|---|
| --name, -n | Evaluator name (must be unique within the space) |
| --space, -s | Space name or ID to create the evaluator in |
| --commit-message | Commit message for the initial version |
| --template-name | Eval column name (alphanumeric, spaces, hyphens, underscores) |
| --template | Prompt template string with {{variable}} placeholders |
| --ai-integration-id | AI integration global ID (base64) |
| --model-name | Model name (e.g. gpt-4o) |
| --description | Optional evaluator description |
| --include-explanations | Include reasoning explanation alongside the score (flag) |
| --use-function-calling | Prefer structured function-call output when supported (flag) |
| --invocation-params | JSON object of model invocation parameters (e.g. '{"temperature": 0}') |
| --provider-params | JSON object of provider-specific parameters |
| --classification-choices | JSON object mapping choice labels to numeric scores (e.g. '{"relevant":1,"irrelevant":0}'). Omit for freeform output. |
| --direction | Optimization direction: maximize, minimize, or none |
| --data-granularity | Data granularity: span, trace, or session |
Example:
ax evaluators create-template-evaluator \
--name "Relevance" \
--space sp_abc123 \
--commit-message "Initial version" \
--template-name "Relevance" \
--template "Is the response relevant to the query?\nQuery: {{input.value}}\nResponse: {{output.value}}" \
--ai-integration-id ai_xyz789 \
--model-name gpt-4o \
--include-explanations \
--invocation-params '{"temperature": 0}' \
--classification-choices '{"relevant":1,"irrelevant":0}'
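Before creating an evaluator, the template-related values can be checked locally. The sketch below extracts the {{variable}} placeholders from a template, checks a template name against the documented character rule, and serializes classification choices to compact JSON. All helper names are hypothetical, and the regular expressions are assumptions based on the option descriptions above; the CLI's own validation may be stricter or looser.

```python
import json
import re
from typing import Dict, List, Union

# Assumption: placeholders look like {{name}} and do not contain nested braces.
PLACEHOLDER = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")
# Assumption: mirrors the documented rule "alphanumeric, spaces, hyphens, underscores".
TEMPLATE_NAME = re.compile(r"[A-Za-z0-9 _-]+")


def template_variables(template: str) -> List[str]:
    """Return the distinct placeholder names in first-seen order."""
    seen: List[str] = []
    for match in PLACEHOLDER.finditer(template):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


def is_valid_template_name(name: str) -> bool:
    """Check the whole name against the allowed character set."""
    return TEMPLATE_NAME.fullmatch(name) is not None


def classification_choices_arg(choices: Dict[str, Union[int, float]]) -> str:
    """Serialize label-to-score choices as compact JSON for --classification-choices."""
    for label, score in choices.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(score, bool) or not isinstance(score, (int, float)):
            raise ValueError(f"score for {label!r} must be numeric")
    return json.dumps(choices, separators=(",", ":"))


template = (
    "Is the response relevant to the query?\n"
    "Query: {{input.value}}\n"
    "Response: {{output.value}}"
)
print(template_variables(template))
print(classification_choices_arg({"relevant": 1, "irrelevant": 0}))
```

Generating the JSON with `json.dumps` rather than typing it by hand sidesteps quoting mistakes when the value is later passed on a command line.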
ax evaluators create-code-evaluator
Create a new code evaluator with an initial version. Use --code-type managed for a built-in check (MatchesRegex, JSONParseable, ContainsAnyKeyword, ContainsAllKeywords, ExactMatch) or --code-type custom to supply Python.
ax evaluators create-code-evaluator \
--name <name> \
--space <id> \
--commit-message <message> \
--code-type managed \
--code-name <name> \
--variables <json-array> \
--managed-evaluator <kind>
| Option | Description |
|---|---|
| --name, -n | Evaluator name (must be unique within the space) |
| --space, -s | Space name or ID to create the evaluator in |
| --commit-message | Commit message for the initial version |
| --code-type | managed (built-in) or custom (user Python) |
| --code-name | Eval column name |
| --variables | JSON array of span attribute names to pass into the evaluator (e.g. '["output"]'). Inline JSON or a @file path. |
| --managed-evaluator | Built-in evaluator (when --code-type managed): MatchesRegex, JSONParseable, ContainsAnyKeyword, ContainsAllKeywords, or ExactMatch |
| --code | Python source (when --code-type custom). Inline or @path/to/evaluator.py. |
| --imports | Optional Python import block for --code-type custom. Inline or @path/to/imports.py. |
| --static-params | JSON array of static parameters. Each item: {name, type, default_value}, where type is STRING, STRING_ARRAY, or REGEX. Inline JSON or a @file path. |
| --query-filter | Optional filter query applied before evaluation |
| --data-granularity | Data granularity: span, trace, or session |
| --description | Optional evaluator description |
Example:
ax evaluators create-code-evaluator \
--name "JSON Parseable" \
--space sp_abc123 \
--commit-message "Initial version" \
--code-type managed \
--code-name "json_parseable" \
--variables '["output"]' \
--managed-evaluator JSONParseable
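The --static-params value is a JSON array whose items follow the {name, type, default_value} shape described in the options table. A small sketch of building that array in Python and checking the documented type and evaluator names before passing it on is shown here. The helper names are hypothetical, and the parameter name "pattern" is purely illustrative: which parameter names each built-in evaluator expects is not documented in this section.

```python
import json
from typing import Any, Dict, List

# Values taken from the option descriptions above.
ALLOWED_PARAM_TYPES = {"STRING", "STRING_ARRAY", "REGEX"}
MANAGED_EVALUATORS = {
    "MatchesRegex",
    "JSONParseable",
    "ContainsAnyKeyword",
    "ContainsAllKeywords",
    "ExactMatch",
}


def static_param(name: str, type_: str, default_value: Any) -> Dict[str, Any]:
    """Build one static-parameter item, rejecting undocumented types."""
    if type_ not in ALLOWED_PARAM_TYPES:
        raise ValueError(f"unsupported static param type: {type_}")
    return {"name": name, "type": type_, "default_value": default_value}


def static_params_arg(params: List[Dict[str, Any]]) -> str:
    """Serialize the items as inline JSON for --static-params."""
    return json.dumps(params)


def check_managed_evaluator(kind: str) -> str:
    """Return the kind unchanged if it is one of the documented built-ins."""
    if kind not in MANAGED_EVALUATORS:
        raise ValueError(f"unknown managed evaluator: {kind}")
    return kind


# Hypothetical parameter name; the real name for MatchesRegex is an assumption.
params = [static_param("pattern", "REGEX", r"^\d{4}-\d{2}-\d{2}$")]
print(static_params_arg(params))
```

The same string could instead be written to a file and referenced with the @file form, which keeps long regular expressions out of the shell command.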
ax evaluators get
Get an evaluator by name or ID, with its resolved version.
ax evaluators get <name-or-id> [--space <id>] [--version-id <id>]
| Option | Description |
|---|---|
| --space | Space name or ID (required when using evaluator name instead of ID) |
| --version-id | Specific version ID to retrieve (default: latest version) |
Examples:
ax evaluators get ev_abc123
ax evaluators get "Relevance" --space sp_abc123
ax evaluators get ev_abc123 --version-id evv_xyz789
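The rule that --space is required when an evaluator is addressed by name can be enforced before the command is ever run. This is a minimal sketch, assuming the caller knows whether it holds an ID or a name; the helper `build_get_command` is hypothetical and only assembles the argument list.

```python
from typing import List, Optional


def build_get_command(
    *,
    evaluator_id: Optional[str] = None,
    name: Optional[str] = None,
    space: Optional[str] = None,
    version_id: Optional[str] = None,
) -> List[str]:
    """Assemble argv for `ax evaluators get`, requiring a space for name lookups."""
    if (evaluator_id is None) == (name is None):
        raise ValueError("pass exactly one of evaluator_id or name")
    if name is not None and space is None:
        raise ValueError("--space is required when looking up by name")
    target = evaluator_id if evaluator_id is not None else name
    argv = ["ax", "evaluators", "get", target]
    if space is not None:
        argv += ["--space", space]
    if version_id is not None:
        argv += ["--version-id", version_id]
    return argv


by_id = build_get_command(evaluator_id="ev_abc123", version_id="evv_xyz789")
by_name = build_get_command(name="Relevance", space="sp_abc123")
print(by_id)
print(by_name)
```

The same check applies to the other commands in this section that accept a name or ID together with an optional --space.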
ax evaluators update
Update an evaluator’s name or description. At least one of --name or --description is required.
ax evaluators update <name-or-id> [--space <id>] [--name <name>] [--description <desc>]
| Option | Description |
|---|---|
| --space | Space name or ID (required when using evaluator name instead of ID) |
| --name | New evaluator name |
| --description | New evaluator description |
Example:
ax evaluators update ev_abc123 --name "Relevance v2" --description "Updated scoring rubric"
ax evaluators delete
Delete an evaluator and all its versions. This operation is irreversible.
ax evaluators delete <name-or-id> [--space <id>] [--force]
| Option | Description |
|---|---|
| --space | Space name or ID (required when using evaluator name instead of ID) |
| --force | Skip the confirmation prompt |
Examples:
ax evaluators delete ev_abc123
ax evaluators delete ev_abc123 --force
ax evaluators delete "Relevance" --space sp_abc123 --force
ax evaluators list-versions
List all versions of an evaluator.
ax evaluators list-versions <name-or-id> [--space <id>] [--limit <n>] [--cursor <cursor>]
| Option | Description |
|---|---|
| --space | Space name or ID (required when using evaluator name instead of ID) |
| --limit | Maximum number of versions to return (default: 15) |
| --cursor | Pagination cursor for the next page |
Example:
ax evaluators list-versions ev_abc123
ax evaluators create-template-evaluator-version
Create a new template version of an existing template evaluator. Versions are immutable once created; the new version becomes the latest immediately. The CLI prompts interactively for any required option that is not passed as a flag.
ax evaluators create-template-evaluator-version <name-or-id> \
--commit-message <message> \
--template-name <name> \
--template <template-string> \
--ai-integration-id <id> \
--model-name <model> \
[--space <id>]
| Option | Description |
|---|---|
| --space, -s | Space name or ID (required when using evaluator name instead of ID) |
| --commit-message | Commit message describing the changes in this version |
| --template-name | Eval column name |
| --template | Updated prompt template string with {{variable}} placeholders |
| --ai-integration-id | AI integration global ID (base64) |
| --model-name | Model name (e.g. gpt-4o) |
| --include-explanations | Include reasoning explanation alongside the score (flag) |
| --use-function-calling | Prefer structured function-call output when supported (flag) |
| --invocation-params | JSON object of model invocation parameters |
| --provider-params | JSON object of provider-specific parameters |
| --classification-choices | JSON object mapping choice labels to numeric scores. Omit for freeform output. |
| --direction | Optimization direction: maximize, minimize, or none |
| --data-granularity | Data granularity: span, trace, or session |
Example:
ax evaluators create-template-evaluator-version ev_abc123 \
--commit-message "Improved prompt for edge cases" \
--template-name "Relevance" \
--template "Rate the relevance of the response on a scale of 0 to 1.\nQuery: {{input.value}}\nResponse: {{output.value}}" \
--ai-integration-id ai_xyz789 \
--model-name gpt-4o
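Because versions are immutable once created, it may be worth comparing the placeholders of the previous and the new template before committing a version, so that an accidentally dropped variable is caught early. The following is a rough sketch; the helper `placeholder_diff` is hypothetical, and the placeholder pattern is an assumption based on the {{variable}} form shown above.

```python
import re
from typing import Set, Tuple

# Assumption: placeholders look like {{name}} and do not contain nested braces.
PLACEHOLDER = re.compile(r"\{\{\s*([^{}]+?)\s*\}\}")


def placeholder_diff(old: str, new: str) -> Tuple[Set[str], Set[str]]:
    """Return (removed, added) placeholder names between two template strings."""
    old_vars = set(PLACEHOLDER.findall(old))
    new_vars = set(PLACEHOLDER.findall(new))
    return old_vars - new_vars, new_vars - old_vars


old_template = (
    "Is the response relevant to the query?\n"
    "Query: {{input.value}}\nResponse: {{output.value}}"
)
new_template = (
    "Rate the relevance of the response on a scale of 0 to 1.\n"
    "Query: {{input.value}}\nResponse: {{output.value}}"
)

removed, added = placeholder_diff(old_template, new_template)
if removed or added:
    print("placeholders changed:", sorted(removed), sorted(added))
else:
    print("placeholders unchanged")
```

For the two example templates in this document the variable sets are identical, so only the wording of the prompt changes between versions.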
ax evaluators create-code-evaluator-version
Create a new code version of an existing code evaluator.
ax evaluators create-code-evaluator-version <name-or-id> \
--commit-message <message> \
--code-type managed \
--code-name <name> \
--variables <json-array> \
--managed-evaluator <kind> \
[--space <id>]
| Option | Description |
|---|---|
| --space, -s | Space name or ID (required when using evaluator name instead of ID) |
| --commit-message | Commit message describing the changes in this version |
| --code-type | managed (built-in) or custom (user Python) |
| --code-name | Eval column name |
| --variables | JSON array of span attribute names to pass into the evaluator. Inline JSON or a @file path. |
| --managed-evaluator | Built-in evaluator (when --code-type managed) |
| --code | Python source (when --code-type custom). Inline or @path/to/evaluator.py. |
| --imports | Optional Python import block for --code-type custom. Inline or @path/to/imports.py. |
| --static-params | JSON array of static parameters. Inline JSON or a @file path. |
| --query-filter | Optional filter query applied before evaluation |
| --data-granularity | Data granularity: span, trace, or session |
Example:
ax evaluators create-code-evaluator-version ev_abc123 \
--commit-message "Updated managed evaluator" \
--code-type managed \
--code-name "json_parseable" \
--variables '["output"]' \
--managed-evaluator JSONParseable
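The example above covers the managed case. For --code-type custom, the options table indicates that the Python source and import block can be supplied as @file references. A sketch of assembling such an invocation follows; the helper name and the file names evaluator.py and imports.py are hypothetical, the expected contents of those files are not specified in this section, and the sketch only builds the argument list without running it.

```python
import json
from typing import List, Optional


def build_custom_version_command(
    evaluator: str,
    commit_message: str,
    code_name: str,
    variables: List[str],
    code_path: str,
    imports_path: Optional[str] = None,
    space: Optional[str] = None,
) -> List[str]:
    """Assemble argv for a custom-code version, passing source files via @path."""
    argv = [
        "ax", "evaluators", "create-code-evaluator-version", evaluator,
        "--commit-message", commit_message,
        "--code-type", "custom",
        "--code-name", code_name,
        # --variables takes a JSON array of span attribute names.
        "--variables", json.dumps(variables),
        "--code", "@" + code_path,
    ]
    if imports_path is not None:
        argv += ["--imports", "@" + imports_path]
    if space is not None:
        argv += ["--space", space]
    return argv


# Hypothetical file names, used only for illustration.
cmd = build_custom_version_command(
    "ev_abc123",
    "Switch to custom check",
    "json_parseable",
    ["output"],
    "evaluator.py",
    imports_path="imports.py",
)
print(cmd)
```

Keeping the source in files rather than inline avoids escaping multi-line Python on the command line.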
ax evaluators get-version
Get a specific evaluator version by ID.
ax evaluators get-version <version-id>
Example:
ax evaluators get-version evv_xyz789