Visualize Evaluator Score Distributions Across Spans and Experiments

May 6, 2026 New  Dashboards and Visualizations Eval score charts are now available to all users. Visualize how your evaluator scores are distributed across spans and experiments, directly from the model overview and tracing pages. No configuration is required.

Review and Confirm Alyx Proposals Before They Take Effect

May 6, 2026 Improvement  Alyx Three Alyx operations that previously applied changes silently now go through a visible confirmation drawer before taking effect. You can review, edit, and accept or skip each proposal before it is saved.
  • Eval Form Proposals: When Alyx suggests creating or updating an evaluator, it now shows an editable drawer with the proposed name, display name, template, and classification choices. Edit any field before accepting.
  • Task Creation: Alyx surfaces a review drawer when proposing a new evaluation task, showing the task name, evaluator, target project or dataset, run mode, and sampling rate before the task is created.
  • Task Configuration: Configuring task parameters through Alyx now always routes through a confirmation drawer, whether you’re on the task-builder page or elsewhere in the platform.
All three operations respect the existing “auto-accept evals & tasks” toggle for workflows that don’t require manual review.

Control Annotation Queue Capacity with Per-Queue Record Limits

May 6, 2026 Improvement  Annotations You can now set and clear a custom max_records cap on individual annotation queues from the queue settings UI. A per-queue limit overrides the global account default, so high-volume queues and targeted review queues can each hold the right number of records without a one-size-fits-all ceiling.

Assign Multiple Annotation Queue Records to a Reviewer in Bulk

May 6, 2026 New  Annotations Assign multiple annotation queue records to a reviewer in a single operation. Select the records you want to route, choose a reviewer, and submit—no need to assign them one at a time.

Wire Experiment Runs into Automated Pipelines with the run_experiment REST API

May 6, 2026 New  SDKs and REST APIs The v2 REST API now supports experiment run tasks. You can create, update, and trigger run_experiment tasks programmatically with the same endpoints used for other task types, making it straightforward to wire experiment runs into automated pipelines.
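
As a rough sketch of what this makes possible, the snippet below creates and then triggers a run_experiment task over HTTP from a pipeline script. The endpoint paths, payload fields, and auth header shown are illustrative assumptions rather than the documented v2 schema; consult the REST API reference for the exact contract.

```python
import os

import requests

# Sketch only: the base URL, endpoint paths, payload fields, and auth header
# below are assumptions for illustration, not the documented v2 REST API schema.
BASE_URL = "https://api.arize.com/v2"
HEADERS = {"Authorization": f"Bearer {os.environ['ARIZE_API_KEY']}"}

# Create a run_experiment task through the same task endpoints used for
# other task types.
resp = requests.post(
    f"{BASE_URL}/tasks",
    headers=HEADERS,
    json={
        "task_type": "run_experiment",    # assumed discriminator value
        "name": "nightly-regression-experiment",
        "dataset_id": "YOUR_DATASET_ID",  # placeholder
        "evaluator": "correctness",       # placeholder evaluator name
    },
    timeout=30,
)
resp.raise_for_status()
task_id = resp.json()["id"]

# Trigger the task on demand, e.g. as a step in a CI/CD pipeline.
run = requests.post(f"{BASE_URL}/tasks/{task_id}/run", headers=HEADERS, timeout=30)
run.raise_for_status()
```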

Fixes and Improvements

May 1–6, 2026
  • Fix  Models and Integrations Azure OpenAI o-family models (o1, o3-mini, o4-mini) now work correctly in Prompt Playground and evals—the default API version is updated to 2025-04-01-preview so you no longer need to enter it manually.
  • Fix  Datasets and Experiments The “View Experiment Traces” button now returns correct results for experiments run via the Arize Python SDK, which uses experiment_id rather than dataset_id.
  • Fix  Evaluators Eval result columns in eval.<name>.<field> format generated by AX experiment evals are no longer dropped before the output is returned (a column-naming sketch follows this list).
  • Fix  Evaluators “View Task Logs” from an eval feedback tooltip now opens the exact task run instead of an approximate lookup that failed for renamed evaluators and older runs.
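
For context on the eval column fix above, here is a minimal pandas sketch of the eval.<name>.<field> naming pattern; the evaluator name and field values are hypothetical examples, not output from a real run.

```python
import pandas as pd

# Hypothetical experiment output: eval result columns follow eval.<name>.<field>.
results = pd.DataFrame({
    "input": ["query a", "query b"],
    "output": ["answer a", "answer b"],
    "eval.correctness.label": ["correct", "incorrect"],  # hypothetical evaluator
    "eval.correctness.score": [1.0, 0.0],
})

# These columns are now preserved in the returned output, so downstream code
# can select them directly.
eval_columns = [col for col in results.columns if col.startswith("eval.")]
print(results[eval_columns])
```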