12.05.2024

New Releases, Enhancements, + Changes

Copilot Enhancements

Span Chat

The Copilot Span Chat skill makes getting value from spans faster and easier. Rather than spending time scrolling through and deciphering span data, teams can now:

  • Analyze spans to extract key insights

  • Ask questions to quickly understand span data

  • Run evaluations on individual spans

Span Chat Evaluation

Dashboard Widget Generator

Building dashboard plots just got way easier. The dashboard skill lets teams:

  • Create time series plots or distributions from natural language

  • Translate code (like Plotly) into ready-to-go visualizations

  • Handle ambiguous filters like "west coast states" and plot multiple widgets at once

Dashboard generator

Misc. Copilot Updates

  • We’ve revamped the main chat experience to be always accessible on the page, with an option to collapse the input bar

  • The Custom Metric skill now supports a conversational flow, making it easier for users to iterate and refine metrics dynamically

Additional Enhancements

Experiment Projects

Experiment traces for a dataset are now consolidated and can be accessed under "Experiment Projects" on the "Projects & Models" page.

Multi-Class/Label Per-Class Calibration & Chart

We’ve just rolled out per-class calibration metrics and a calibration chart. Users can see calibration scores for each class separately and view the calibration chart, all in one place.

  • To view per-class calibration, select calibration from the metric dropdown, then choose a class

Per-class calibration

  • The calibration chart can be found under the "More Charts" tab

Calibration Chart
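
For readers who want a feel for what a per-class calibration view summarizes, here is a minimal, platform-independent sketch in plain Python. It treats one class at a time (one-vs-rest), bins the predicted probability for that class, and compares the mean predicted probability with the observed frequency in each bin. The function names, the binning scheme, and the sample data are illustrative assumptions; this is not necessarily the formula the product uses for its calibration score or chart.

```python
from typing import Dict, List, Sequence, Tuple

# (mean predicted probability, observed frequency, sample count) for one bin
Bin = Tuple[float, float, int]


def per_class_calibration_curve(
    probs: Sequence[Dict[str, float]],
    labels: Sequence[str],
    target_class: str,
    n_bins: int = 5,
) -> List[Bin]:
    """One-vs-rest calibration curve for `target_class` (hypothetical helper)."""
    sums = [0.0] * n_bins
    hits = [0] * n_bins
    counts = [0] * n_bins
    for row, label in zip(probs, labels):
        p = row.get(target_class, 0.0)
        # Equal-width bins over [0, 1]; p == 1.0 is clamped into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        sums[idx] += p
        hits[idx] += int(label == target_class)
        counts[idx] += 1
    # Skip empty bins so they do not appear as points on the chart.
    return [
        (sums[i] / counts[i], hits[i] / counts[i], counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]


def expected_calibration_error(curve: Sequence[Bin]) -> float:
    """Count-weighted average gap between predicted and observed frequency."""
    total = sum(count for _, _, count in curve)
    if total == 0:
        return 0.0
    return sum(count / total * abs(mean_p - freq) for mean_p, freq, count in curve)


# Made-up predictions for a two-class problem (illustrative data only).
sample_probs = [
    {"cat": 0.9, "dog": 0.1},
    {"cat": 0.8, "dog": 0.2},
    {"cat": 0.3, "dog": 0.7},
    {"cat": 0.1, "dog": 0.9},
]
sample_labels = ["cat", "cat", "dog", "cat"]

cat_curve = per_class_calibration_curve(sample_probs, sample_labels, "cat", n_bins=2)
cat_ece = expected_calibration_error(cat_curve)
```

A perfectly calibrated class would have every bin's observed frequency equal to its mean predicted probability, so the points would sit on the diagonal of the chart.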

SDK Version 7.29.0

  • Log experiments from a previously created dataframe

📚 New Content

The latest video tutorials, paper readings, ebooks, self-guided learning modules, and technical posts:

🧑‍⚖️ Agent-as-a-Judge: Evaluate Agents with Agents

🤖 LLM-as-a-Judge Evaluation for GenAI Use-Cases

🌎 Building an AI Agent that Thrives in the Real World

🛠️ AI Agent Workflows and Architectures Masterclass

🔬 Agents in the Wild: Geotab
