Documentation Index
Fetch the complete documentation index at: https://arize-ax.mintlify.dev/docs/llms.txt
Use this file to discover all available pages before exploring further.
SDKs & REST APIs
March 13–19, 2026
New SDK clients and REST API endpoints for managing platform resources programmatically:
- Evaluators (Python & JavaScript SDKs): Create, manage, and version evaluators programmatically through Python and JavaScript SDKs. Full create, read, update, and delete operations for evaluators, plus list, create, and retrieve for evaluator versions. Enables automated evaluator lifecycle management and integration of evaluation workflows into existing development processes.
- Prompts (Python & JavaScript SDKs): Manage prompts and prompt versions through Python and JavaScript SDKs with full create, read, update, and delete operations across all prompt and version endpoints. Includes label management for organizing and retrieving specific versions. Set labels like “production” on any version for easy resolution, and labels automatically move when reassigned.
- AI Integrations (Python SDK): Connect Arize AX with AI frameworks programmatically through the Python SDK. List, create, update, and delete integrations to automate the setup of instrumentation and monitoring across your AI applications.
- Roles Management: Create, read, update, and delete custom roles through the API, enabling programmatic role management and automated access control workflows across your organization.
- Name-Based Search: Find resources faster with case-insensitive name search across all major list endpoints including projects, prompts, datasets, experiments, spaces, annotation configs, and annotation queues. Flexible substring matching enables quick resource location without remembering exact names.
- Space Deletion: Manage the full space lifecycle through the API with the ability to delete spaces programmatically, completing the set of space management operations.
- Prompt Version & Label Management: Create, retrieve, and manage prompt versions with labels like “production” for easy resolution. List all versions of a prompt, create new versions with commit messages, and set/replace labels on versions. Labels automatically move when reassigned, ensuring the correct version is always referenced.
- API Keys Management (Python & JavaScript SDKs): Create, list, and delete API keys programmatically through Python and JavaScript SDKs. Two types of keys are available: user keys, which inherit the individual user's permissions (multiple keys per user are supported), and service keys, which are tied to bot users for organizational continuity. This enables secure automation and integration with external systems while ensuring service accounts remain functional even when team members leave.
- AI Integrations: List, create, update, and delete integrations with cursor-based pagination and space filtering. Connect Arize AX with various AI frameworks like OpenAI Agents, LangGraph, and Autogen for automatic instrumentation and monitoring with just a few lines of code, dramatically simplifying observability infrastructure.
- Evaluators: List evaluators programmatically with cursor-based pagination and space filtering. Each evaluator includes ID, name, description, space ID, task type, tags, current version, and timestamps. Transform subjective AI outputs into measurable, trackable metrics that enable teams to confidently iterate on applications and ensure consistent quality at scale.
- Annotation Queue Records: Retrieve annotation queue records via REST API with pagination support. View annotations, assigned users with status, source data, and evaluations for each record. Streamlines the annotation workflow by providing programmatic access to centralized data labeling processes, ensuring organized and consistent annotation tracking across teams.
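The name-based search described above uses case-insensitive substring matching across list endpoints. A minimal sketch of that matching logic, purely illustrative (the resource names and dict shape are made up, not the SDK's actual types):

```python
def matches_name(resources, query):
    """Case-insensitive substring match over resource names,
    mirroring the search behavior described for list endpoints."""
    q = query.lower()
    return [r for r in resources if q in r["name"].lower()]

# Hypothetical project records for demonstration.
projects = [
    {"name": "Chatbot-Prod"},
    {"name": "chatbot-staging"},
    {"name": "Eval Harness"},
]
print(matches_name(projects, "chatbot"))  # matches both chatbot projects
```

Substring matching means a partial query like "chatbot" finds resources without the caller remembering exact names or casing.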
Evaluator Improvements
March 13–19, 2026
- Generic Variables with Auto-Mapping: Streamlined evaluator setup with intelligent variable mapping. When creating an evaluator from a template or blank, recognized variables like input, output, question, reference, and tool_call are automatically mapped based on the selected datasource type (span attributes for Projects or column names for Datasets). Override any mapping with a single click via dropdown or freetext for full customization.
- Better Validation for Code Evaluators: Enhanced validation and error messaging when creating or editing code evaluators, with clearer feedback on syntax errors and missing required fields. Catches issues early before running full evaluations, reducing wasted compute and iteration time.
- Online Task Resources Configuration: Configure CPU and memory resources for online task evaluators to optimize performance and cost based on workload requirements. Provides more granular control over how evaluations run at scale.
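The auto-mapping behavior above can be sketched as a small lookup: recognized variables resolve to span attributes for Projects and to same-named columns for Datasets, with unrecognized variables left unmapped for manual override. The attribute paths below are assumptions for illustration, not Arize AX's actual attribute names:

```python
# Hypothetical span-attribute paths; the real attribute names may differ.
SPAN_ATTRIBUTE_MAP = {
    "input": "attributes.input.value",
    "output": "attributes.output.value",
    "reference": "attributes.reference.value",
}

def auto_map(variables, datasource):
    """Map recognized evaluator variables to a datasource field.

    Projects map to span attributes; Datasets map to columns of the
    same name. Unmapped entries (None) are left for manual override.
    """
    mapping = {}
    for var in variables:
        if datasource == "project":
            mapping[var] = SPAN_ATTRIBUTE_MAP.get(var)
        else:  # dataset
            mapping[var] = var
    return mapping

print(auto_map(["input", "output"], "project"))
```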
Alyx Improvements
March 13–18, 2026
- Multi-Span Support: Alyx can now analyze and work with multiple spans simultaneously, enabling more powerful conversational debugging and analysis workflows across traces without switching context.
- Dataset Page Context: Alyx now has richer awareness on dataset pages, including selected experiments, latest available experiments, and active evaluators. Provides more relevant, context-specific assistance when working with datasets and experiments.
- Auto-Trigger on Destination Pages: When Alyx on the home page links to another page, it can now automatically open that page's Alyx with context pre-loaded, providing seamless continued help across navigation.
- Bedrock Integration: Use AWS Bedrock integrations as the model powering Alyx. Provider parameters (region, anthropic_version, etc.) are passed through the full stack, from the UI to the generative services. A warning icon appears when a Bedrock integration is selected without a configured region; it opens a dialog that persists the parameters to localStorage so the message can be retried automatically.
- Session Reading: Alyx can now read and analyze sessions with two new tools: get session table preview, for an overview of the sessions table, and get session data, to see all traces within a session. Tools that fetch assets by ID now search at least a 60-day window, regardless of the selected time range. Improved compression logic for large JSON maintains valid JSON structure and adds pagination indicators.
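The JSON compression behavior described above, returning only part of a large structure while keeping the result valid JSON and signaling what was omitted, might look like this sketch (the field names `items`, `shown`, and `omitted` are illustrative, not the actual wire format):

```python
import json

def compress_json_list(items, page_size=2):
    """Return the first page of a large list as valid JSON,
    with a pagination indicator showing how many items were omitted."""
    page = items[:page_size]
    remaining = len(items) - len(page)
    doc = {"items": page}
    if remaining > 0:
        doc["pagination"] = {"shown": len(page), "omitted": remaining}
    return json.dumps(doc)

out = compress_json_list([{"id": i} for i in range(5)])
print(out)  # first two items plus a pagination indicator
```

Because the output is still parseable JSON, a downstream consumer can both read the partial data and detect that more is available.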
Annotation Improvements
March 13–19, 2026
- Queue Records Management: Add and delete records from annotation queues with new record operations. Assignee and status filters now persist in the URL for easy sharing and bookmarking, with an empty state UI for guidance when no filters are applied.
- Session Slideover Annotations: The annotate button has moved from individual messages to the trace level in session conversations for clearer context. A trace number subtitle now appears in the annotation panel, making it clear which trace is being annotated.
- View Source Data: Clicking “View Source Data” in annotation queues now opens the full span dialog instead of a limited preview, providing complete trace context for more informed annotation decisions.
CLI Commands
March 13–17, 2026
New CLI commands for managing spaces and profiles:
- Spaces Management: Create, list, get, and update spaces directly from the command line with formatted table output. The Spaces API has been promoted to stable, providing reliable programmatic access to space management.
- Profile Recovery: The CLI now gracefully handles invalid or extra configuration fields instead of crashing. A new profile fix command helps diagnose and repair broken profile configurations, enabling quick recovery from configuration issues without manual file editing.
- Annotation Configs CRUD: Create, list, and delete annotation configs through the CLI. Reusable schemas define how to structure human feedback and evaluations across your workspace with consistent rubrics (categorical labels, continuous scores, or freeform text). Consistent annotations enable better error analysis, help build high-quality training datasets, and provide reliable ground truth data for improving prompts and model fine-tuning.
- Dataset, Experiment, and Span Export: Export datasets, experiments, and spans using new CLI commands. Download all examples from datasets, all runs from experiments, or spans filtered by trace/span/session ID. Append examples to existing datasets from inline JSON or files (CSV, JSON, JSONL, Parquet) with client-side structural validation and server-side field-level validation.
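The client-side structural validation mentioned for dataset appends can be sketched as a pre-flight check: every example must be an object containing the required fields before anything is sent to the server. The required field names here are assumptions for illustration, not the CLI's actual schema:

```python
def validate_examples(examples, required_fields=("input", "output")):
    """Client-side structural check before appending examples to a
    dataset: every record must be a dict containing the required
    fields. Returns (index, error) tuples; an empty list means valid."""
    errors = []
    for i, ex in enumerate(examples):
        if not isinstance(ex, dict):
            errors.append((i, "not an object"))
            continue
        for field in required_fields:
            if field not in ex:
                errors.append((i, f"missing field: {field}"))
    return errors

print(validate_examples([{"input": "q1", "output": "a1"}, {"input": "q2"}]))
```

Catching structural problems client-side gives immediate, line-level feedback, while the server-side field-level validation described above remains the authority on accepted content.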
Trace & Session Improvements
March 17–18, 2026
- Saved Filters Migration: Saved trace filters have been automatically migrated to the new trace views system with improved comparison logic that handles filter order correctly. Existing saved filters continue to work seamlessly.
- Sessions in Filter Slideover: The filter slideover now includes a sessions table, making it easier to build filters based on session-level attributes and metadata alongside existing trace filters.
- Copy Trace Events: Quickly copy event details (name, message, stacktrace, and timestamp) from trace slideover events with a single click, streamlining debugging workflows and issue reporting.
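The order-insensitive comparison mentioned in the saved-filters migration amounts to treating a filter list as a set. A minimal sketch under the assumption that a filter is a (field, operator, value) triple, which is not necessarily the real internal representation:

```python
def filters_equal(a, b):
    """Compare two saved filter lists ignoring order, so that
    [A, B] and [B, A] are recognized as the same trace view."""
    def normalize(filters):
        return frozenset((f["field"], f["op"], f["value"]) for f in filters)
    return normalize(a) == normalize(b)

x = [{"field": "latency", "op": ">", "value": "1s"},
     {"field": "model", "op": "=", "value": "gpt-4"}]
y = list(reversed(x))
print(filters_equal(x, y))  # True: same filters, different order
```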
Dashboard Export
March 13, 2026
Webhooks
March 18, 2026
Bedrock Integration Updates
March 16–17, 2026
- Bearer Token Authentication: Added support for AWS Bedrock integrations using bearer token authentication, providing a simpler alternative to IAM-based credential management for teams that prefer token-based auth workflows.
- Inference Profile Support: AWS Bedrock inference profiles are now fully supported without requiring model IDs to be in a predefined allowlist. Unknown inference profile identifiers pass through directly to Bedrock for native error handling, enabling immediate use of custom or newly released profiles.
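The two Bedrock changes above can be sketched together: a bearer token goes into a standard Authorization header instead of IAM request signing, and the model or inference-profile identifier is placed into the request path unchecked so Bedrock validates it natively. The URL and request shape below are placeholders, not the actual Bedrock or Arize AX API:

```python
def bedrock_request(model_id, bearer_token):
    """Build a Bedrock-style invocation request (illustrative shape).

    The bearer token replaces per-request IAM signing, and model_id
    passes through without an allowlist check, so custom or newly
    released inference profiles work immediately and Bedrock itself
    reports any unknown-identifier errors.
    """
    return {
        # Placeholder endpoint, not the real Bedrock runtime URL.
        "url": f"https://bedrock-runtime.example.com/model/{model_id}/invoke",
        "headers": {
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    }

req = bedrock_request("us.custom.profile-v1", "my-token")
print(req["headers"]["Authorization"])  # Bearer my-token
```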