
Arize Release Notes: Copilot Enhancements, Experiment Projects, and More

Published Dec 5, 2024

Sarah Welsh

Contributor

Welcome to our regular update on new releases, enhancements, and changes.

What’s New

Copilot Enhancements

Span Chat

The Copilot Span Chat skill makes getting value from spans faster and easier. Rather than spending time scrolling through and deciphering span data, teams can now:

Analyze spans to extract key insights
Ask questions to quickly understand span data
Run evaluations on individual spans

Screenshot: Span Chat evaluation

Dashboard Widget Generator

Building dashboard plots just got way easier. The dashboard skill lets teams:

Create time series plots or distributions from natural language
Translate code (like Plotly) into ready-to-go visualizations
Handle ambiguous filters like “west coast states” and plot multiple widgets at once

Screenshot: Dashboard generator

Misc. Copilot Updates

We’ve revamped the main chat experience to be always accessible on the page, with an option to collapse the input bar
The Custom Metric skill now supports a conversational flow, making it easier for users to iterate and refine metrics dynamically

Additional Enhancements

Experiment Projects

Experiment traces for a dataset are now consolidated and can be accessed under “Experiment Projects” on the “Projects & Models” page.

Screenshot: Experiment projects

Multi-Class/Label Per-Class Calibration & Chart

We’ve just rolled out per-class calibration metrics and a calibration chart. Users can see calibration scores for each class separately and view the calibration chart, all in one place.

To view per-class calibration, select calibration from the metric dropdown and choose a class.

Screenshot: Per-class calibration

The calibration chart can be found under the “More Charts” tab.

Screenshot: Calibration chart
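
For readers who want a feel for what a per-class calibration score and chart capture, here is a minimal sketch in plain Python. It is not Arize's implementation: these notes don't state the exact formula, so the sketch assumes a one-vs-rest framing, a score defined as average predicted probability divided by the observed rate of the class, and equal-width bins for the chart. The function name `per_class_calibration` and the input layout are hypothetical.

```python
from collections import defaultdict


def per_class_calibration(probs, labels, target_class, n_bins=5):
    """One-vs-rest calibration for a single class (illustrative assumption).

    probs:  list of dicts mapping class name -> predicted probability
    labels: list of true class names, same length as probs
    Returns (ratio, bins):
      ratio = mean predicted probability / observed rate of target_class
      bins  = [(mean_predicted, observed_rate, count), ...] for non-empty bins
    """
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and equal length")

    # Treat the chosen class as positive and every other class as negative.
    scores = [p.get(target_class, 0.0) for p in probs]
    hits = [1.0 if y == target_class else 0.0 for y in labels]

    # Group predictions into equal-width probability bins for the chart.
    buckets = defaultdict(list)
    for s, h in zip(scores, hits):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        buckets[idx].append((s, h))

    bins = []
    for idx in sorted(buckets):
        pairs = buckets[idx]
        mean_pred = sum(s for s, _ in pairs) / len(pairs)
        observed = sum(h for _, h in pairs) / len(pairs)
        bins.append((mean_pred, observed, len(pairs)))

    mean_actual = sum(hits) / len(hits)
    mean_pred_all = sum(scores) / len(scores)
    # The ratio is undefined when the class never occurs in the labels.
    ratio = mean_pred_all / mean_actual if mean_actual else float("nan")
    return ratio, bins


# Toy data, made up for illustration only.
example_probs = [
    {"cat": 0.9, "dog": 0.1},
    {"cat": 0.8, "dog": 0.2},
    {"cat": 0.3, "dog": 0.7},
    {"cat": 0.2, "dog": 0.8},
]
example_labels = ["cat", "cat", "dog", "cat"]

cat_ratio, cat_bins = per_class_calibration(example_probs, example_labels, "cat")
```

Under this assumed definition, a ratio near 1 suggests the model's confidence for that class roughly matches how often the class actually occurs, and plotting each bin's mean predicted probability against its observed rate gives the familiar calibration curve.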

SDK Version 7.29.0

Log experiments from a previously created dataframe

📚 New Content

The latest video tutorials, paper readings, ebooks, self-guided learning modules, and technical posts:

🧑‍⚖️ Agent-as-a-Judge: Evaluate Agents with Agents
🤖 LLM-as-a-Judge Evaluation for GenAI Use-Cases
🌎 Building an AI Agent that Thrives in the Real World
🛠️ AI Agent Workflows and Architectures Masterclass
🔬 Agents in the Wild: Geotab

