08.2025

Experiment Traces Improvements

August 19, 2025

The new traces page slide-over enhances the experiment tracing experience, with hover buttons now always visible, experiment traces added to the overflow menu, and search functionality added to the Experiments List Page.

Dataset Management Upgrades

August 18, 2025

The Datasets interface has been improved with CSV upload fixes, search capabilities on the Datasets List Page, and REST API support for dataset deletion.

Experiments Refinements

August 16, 2025

Color maps and diffing functionality have been improved, and the trace metadata now uses experiment IDs for better consistency. The experiment compare headers also feature a pinned experiment button for easier navigation.

Dataset Filtering & REST API Updates

August 15, 2025

Datasets now support improved filtering capabilities, better column organization following semantic conventions, and expanded REST API coverage for listing datasets and examples.

Text areas on dataset example pages can now expand to full column width, and dataset filter history is now preserved.

Dedicated Agent Graph Tab

August 14, 2025

The tracing interface now includes a dedicated Agent Graph tab, making it easier to visualize and explore agent interactions within traces.

Alyx Copilot API Advancements

August 13, 2025

The Copilot API now supports structured output, improved frontend message parsing, and streamlined post-processing workflows, a major upgrade to the AI assistant architecture.

Playground Performance Updates

August 12, 2025

Playground data loading has been improved to boost reliability and performance. Missing metrics displays for AWS Bedrock models have also been fixed, making evaluation workflows smoother and more consistent.

Project UX Enhancements

August 11, 2025

The Projects page now features improved navigation and usability, with the addition of Tasks Provider to simplify task and evaluation management.

Support for GPT-5 in Prompt Playground

Prompt Playground now supports GPT-5, giving users access to the latest OpenAI model for experimentation and evaluation.

Trace Interactivity Improvements

August 6, 2025

Hover states have been added for trace costs, and spans in traces are now clickable, making it easier to explore cost details and navigate through trace data.

Revamped Eval and Tasks Experience

August 4, 2025

The Evals experience has been upgraded with a redesigned Tasks page and updated slideovers for a cleaner workflow. A save button has been added to Evals slideovers, counters in Datasets now stay up to date, and evaluators automatically refresh from datasets.

Expanded Annotation Configuration Capabilities

August 5, 2025

Annotations now support up to five labels per configuration, giving teams more flexibility to capture nuanced judgments and tailor evaluation workflows to their needs.

This release also adds improved validations, clearer table views, and multiple UI and labeling queue enhancements for a smoother annotation workflow.

Image Support for Datasets and Labeling Queues

August 4, 2025

Images are now supported in both datasets and labeling queues, with updated column groupings, clearer example ID displays, and reference tokens in headers. This release also introduces download tooltips for datasets and experiments, making it easier to export data directly from the UI.

Experiment Comparison & Data Visualization Improvements

August 2, 2025

The Experiments page has been redesigned with improved UX and richer charting, including new select components and diffing support on the compare page for clearer side-by-side analysis.

Average aggregate metrics are now shown in experiment headers. Usability fixes such as expandable/collapsible tables, editable experiment names, and updated column headers make workflows smoother.
