
Release Notes



06.13.2025: Session Filtering 🪄

Available in Phoenix 10.12+

New Features:

  • Added an optional sessionId argument to the Project.sessions GraphQL field, enabling filtering by session_id.

  • Integrated support across the backend resolver and frontend UI to seamlessly filter and display sessions matching a specific session_id.
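
As a sketch, the new argument can be exercised with a raw GraphQL request. The sessions(sessionId:) argument is taken from the release note; the endpoint path, selection set, and variable types below are illustrative assumptions:

```python
import json
from urllib.request import Request

# Hypothetical GraphQL request using the new sessionId argument on
# Project.sessions. The argument name comes from the release note; the
# selection set and /graphql path are illustrative assumptions.
SESSIONS_QUERY = """
query ProjectSessions($projectId: ID!, $sessionId: String) {
  node(id: $projectId) {
    ... on Project {
      sessions(sessionId: $sessionId) {
        edges { node { sessionId } }
      }
    }
  }
}
"""

def build_sessions_request(base_url: str, project_id: str, session_id: str) -> Request:
    payload = {
        "query": SESSIONS_QUERY,
        "variables": {"projectId": project_id, "sessionId": session_id},
    }
    return Request(
        f"{base_url}/graphql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_sessions_request("http://localhost:6006", "UHJvamVjdDox", "my-session-id")
```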

04.16.2025: API Key Generation via API 🔐

Available in Phoenix 8.26+

Phoenix now supports programmatic API key creation through a new endpoint, making it easier to automate project setup and trace logging. To enable this, set the PHOENIX_ADMIN_SECRET environment variable in your deployment.
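
A minimal sketch of what a scripted call to such an endpoint could look like. The route path and bearer-token auth scheme below are placeholders, not the documented API; consult the Phoenix docs for the real endpoint:

```python
import json
import os
from urllib.request import Request

# Sketch only: the exact route and auth header for programmatic API key
# creation are placeholders; see the Phoenix docs for the real ones.
PHOENIX_URL = os.environ.get("PHOENIX_URL", "http://localhost:6006")
ADMIN_SECRET = os.environ.get("PHOENIX_ADMIN_SECRET", "change-me")

def build_create_key_request(name: str) -> Request:
    payload = {"name": name}
    return Request(
        f"{PHOENIX_URL}/v1/api-keys",  # placeholder path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {ADMIN_SECRET}",  # placeholder scheme
        },
        method="POST",
    )

req = build_create_key_request("ci-tracing-key")
# urllib.request.urlopen(req) would send the request to a running server.
```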

Improvements and Bug Fixes 🐛

  • Tracing: Add load more and loading state to the infinite scroll

  • UI: Hide menu for changing role for self in UsersTable

  • Security: Prevent admins from changing their own roles

  • Infrastructure: Remove WebSocket dependency and migrate to Multipart Subscriptions

03.19.2025: Access to New Integrations in Projects 🔌

Available in Phoenix 8.15+

In the New Project tab, we've added quick setup to instrument your application for BeeAI, SmolAgents, and the OpenAI Agents SDK.

Easily configure all integrations with streamlined instructions. Check out all Phoenix tracing integrations here.

02.18.2025: One-Line Instrumentation ⚡️

Available in Phoenix 8.0+

Phoenix has made it even simpler to get started with tracing by introducing one-line auto-instrumentation. By using register(auto_instrument=True), you can enable automatic instrumentation in your application, which will set up instrumentors based on your installed packages.

from phoenix.otel import register

register(auto_instrument=True)

For more details, you can check the docs and explore further options.

07.25.2025: Project Dashboards 📈

Available in Phoenix 11.12+

In the latest release, Arize Phoenix now includes dedicated project dashboards featuring:

  • Trace latency and error metrics

  • Latency quantiles

  • Annotation scores over time

  • Cost trends by token type

  • Top models ranked by cost and token usage

  • LLM invocation and error tracking

  • Tool calls and error statistics

You can set the project dashboard as the default view for your project in the configuration page.

Learn more in the docs.

07.21.2025: Project and Trace Management via GraphQL 📤

Available in Phoenix 11.9+

New Features:

  • Added transferTracesToProject GraphQL mutation to move traces between projects, preserving annotations and cost calculations for seamless reorganization.

  • Added createProject GraphQL mutation to create new projects programmatically via the API.
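
The mutation names come from the release note; the input argument shapes in this sketch are illustrative assumptions:

```python
import json

# Hypothetical GraphQL documents for the new mutations. Only the mutation
# names (transferTracesToProject, createProject) come from the release note;
# argument and field names are assumptions.
TRANSFER_TRACES = """
mutation TransferTraces($traceIds: [ID!]!, $projectId: ID!) {
  transferTracesToProject(input: {traceIds: $traceIds, projectId: $projectId}) {
    __typename
  }
}
"""

CREATE_PROJECT = """
mutation CreateProject($name: String!) {
  createProject(input: {name: $name}) {
    project { id name }
  }
}
"""

def graphql_payload(query: str, variables: dict) -> str:
    # Serialize a query + variables pair for POSTing to the GraphQL endpoint.
    return json.dumps({"query": query, "variables": variables})

payload = graphql_payload(CREATE_PROJECT, {"name": "my-new-project"})
```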

07.18.2025: OpenInference Java ✨

OpenInference Java is now available, providing a comprehensive solution for tracing AI applications using OpenTelemetry. It is fully compatible with any OpenTelemetry collector or backend, such as Arize.

Included in this release:

  • openinference-semantic-conventions: Java constants for capturing model calls, embeddings, and tool usage.

  • openinference-instrumentation: Core utilities for manual OpenInference instrumentation.

  • openinference-instrumentation-langchain4j: Auto-instrumentation for LangChain4j applications.

All libraries are published and ready to add to your build to initialize tracing and capture rich AI traces.

Learn more in the documentation and in the packages on Maven Central.

07.13.2025: Experiments Module in phoenix-client 🧪

Available in Phoenix 11.7+

New Features:

  • Added a new experiments property to both Client and AsyncClient for invoking experiment workflows.

  • Introduced Experiments and AsyncExperiments classes with run_experiment methods supporting tasks, evaluators, dry-run mode, and metadata.

  • Implemented SyncExecutor and AsyncExecutor classes for concurrent execution with built-in progress bars.

  • Added RateLimiter and AdaptiveTokenBucket for intelligent handling and throttling of rate-limit errors.
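
To make the shape of these workflows concrete, here is a minimal, hypothetical task and evaluator pair of the kind run_experiment accepts; the exact parameter names and signatures may differ, so treat this as a sketch:

```python
# Hypothetical task and evaluator for an experiment run. The run_experiment
# call shape is sketched in the trailing comment; exact signatures may differ.

def task(example: dict) -> str:
    # A task maps a dataset example to an output; here, a trivial normalizer.
    return example["input"].strip().lower()

def exact_match(output: str, expected: str) -> float:
    # An evaluator scores a task output: 1.0 for a match, 0.0 otherwise.
    return 1.0 if output == expected else 0.0

# With a Phoenix client available, the run could look roughly like:
# from phoenix.client import Client
# client = Client()
# client.experiments.run_experiment(
#     dataset=..., task=task, evaluators=[exact_match], dry_run=True
# )
```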

Bug Fixes:

  • Fixed a typo in the datasets.get_dataset_versions docstring.

Enhancements:

  • Introduced a PhoenixException base class and refactored exception imports for consistency.

  • Simplified rate limiter output by replacing printif with direct print statements.

07.09.2025: Baseline for Experiment Comparisons 🔁

Available in Phoenix 11.4+

You can now set a baseline run when comparing multiple experiments. This is especially useful when one run represents a known-good output (e.g. a previous model version or a CI-approved run), and you want to evaluate changes relative to it.

For example, in an evaluation like accuracy, you can easily see where the value flipped from correct → incorrect or incorrect → correct between your baseline and the current comparison - helping you quickly spot regressions or improvements.

This feature makes it easier to isolate the impact of changes like a new prompt, model, or dataset.

07.07.2025: Database Disk Usage Monitor 🛑

Available in Phoenix 11.5+

New Features:

  • Added a disk usage monitor daemon that periodically checks storage consumption.

  • Sends warning emails to administrators when usage crosses a configured threshold.

  • Blocks insert/update operations when usage exceeds a higher critical threshold.

  • Introduced configurable environment variables for warning and blocking thresholds with validation.

  • Integrated disk usage checks into both the FastAPI app and the gRPC server to enforce write blocking.

Enhancements:

  • Extended the email sender with a method and HTML template specifically for disk usage alert notifications.

07.03.2025: Cost Summaries in Trace Headers 💸

Available in Phoenix 11.4+

You can now see total and segmented costs directly in your Phoenix trace headers for faster debugging and spend visibility.

New Features:

  • Extended TraceDetails GraphQL query to include costSummary fields (prompt, completion, total).

  • Passes costSummary data into TraceHeader and displays formatted total cost.

  • Adds a tooltip in TraceHeader showing prompt vs. completion cost breakdown.

06.25.2025: Cost Tracking 💰

Available in Phoenix 11.0+

Phoenix now allows you to track token-based costs for LLM runs automatically, calculating costs from token counts and model pricing data and rolling them up to trace and project levels for comprehensive analysis.

New Features:

  • Automatic calculation of token-based costs using Phoenix’s built-in model pricing table.

  • Support for custom pricing configurations in Settings > Models when needed.

  • Token counts and model information are captured automatically when using OpenInference auto-instrumentation with OpenAI, Anthropic, and other supported SDKs.

  • For manual instrumentation, token count attributes can be included in spans to enable cost tracking.

  • OpenTelemetry users can leverage OpenInference semantic conventions to include token counts in LLM spans.
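
For manual instrumentation, the attribute keys follow the OpenInference semantic conventions for token counts; a minimal sketch of building such attributes (the helper function is illustrative, not part of any SDK):

```python
# Span attributes that enable cost tracking for manually created LLM spans.
# Keys follow the OpenInference semantic conventions for token counts.
def llm_token_attributes(model: str, prompt_tokens: int, completion_tokens: int) -> dict:
    return {
        "llm.model_name": model,
        "llm.token_count.prompt": prompt_tokens,
        "llm.token_count.completion": completion_tokens,
        "llm.token_count.total": prompt_tokens + completion_tokens,
    }

attrs = llm_token_attributes("gpt-4o", 512, 128)
# With an OpenTelemetry tracer, these would be attached to the LLM span, e.g.:
# with tracer.start_as_current_span("llm_call", attributes=attrs): ...
```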

More information is available in our documentation.

06.25.2025: New Phoenix Cloud ☁️

We’ve added a comprehensive management and provisioning layer to Phoenix, enabling enhanced team collaboration and access control.

New Features:

  • Ability to create and manage multiple customized Phoenix spaces tailored to different teams and use cases.

  • Granular user access management for each individual space.

  • Support for multiple users collaborating within the same Phoenix projects.

06.25.2025: Amazon Bedrock Support in Playground 🛝

Available in Phoenix 10.15+

Phoenix’s Playground now supports Amazon Bedrock, allowing users to run prompts directly against Bedrock-hosted models within the platform.

New Features:

  • Run prompts on Amazon Bedrock models seamlessly from Phoenix’s Playground.

  • Compare outputs side-by-side with other model providers for better evaluation.

  • Instantly track usage metrics, latency, and cost associated with Bedrock models.

  • Fine-tune prompt strategies within Phoenix without needing to switch tools.

06.13.2025: Enhanced Span Creation and Logging 🪐

Available in Phoenix 10.12+

New Features:

  • Added POST /projects/{project_identifier}/spans route for span ingestion.

  • Added log_spans client method to submit a sequence of spans, rejecting the entire batch if any span is invalid or not unique.

  • Added log_spans_dataframe for submitting spans as a dataframe.

  • Introduced uniquify_spans and uniquify_spans_dataframe helpers to regenerate span and trace IDs while preserving relationships.

  • Improved validation and error handling to prevent partial ingestion and ensure safe, conflict-free span creation.

Example usage of log_spans and uniquify_spans appears in the code sample later on this page.

06.12.2025: Dataset Filtering 🔍

Available in Phoenix 10.11+

This release enables filtering of datasets by name across both the API and user interface, integrating a live search input along with support for pagination and sorting to improve data navigation and usability.

  • Added a DatasetFilter input and enum to the GraphQL schema, allowing users to filter datasets by name using case-insensitive matching.

  • Created a debounced DatasetsSearch component on the Datasets page that lets users filter results live as they type.

06.06.2025: Experiment Progress Graph 📊

Available in Phoenix 10.9+

New visualizations in Phoenix provide deeper insights into experiment performance over time.

With Experiment Progress Charts, you can now:

  • Visualize how evaluation scores evolve across experiment runs

  • Monitor evaluator performance and detect regressions

  • Analyze latency trends to identify bottlenecks and inefficiencies

These collapsible visual tools eliminate the need for manual inspection and make it significantly easier to track the impact of changes in your LLM or agent workflows.

06.04.2025: Ollama Support in Playground 🛝

Available in Phoenix 10.7+

We’ve added support for Ollama in the Playground, enabling you to experiment with and customize model parameters directly within the platform for more flexible and tailored prompt versioning.

04.28.2025: Improved Shutdown Handling 🛑

Available in Phoenix 8.28+

When stopping the Phoenix server via Ctrl+C, the shutdown process now exits cleanly without displaying a traceback or returning a non-zero exit code. Previously, a KeyboardInterrupt and CancelledError traceback could appear, ending the process with status code 130. The server now swallows the interrupt for a smoother shutdown experience, exiting with code 0 by default to reflect intentional termination.

Improvements and Bug Fixes 🐛

  • Use Float for token count summaries

  • Improve browser compatibility for table sizing

  • Simplify homeLoaderQuery to prevent idle timeout errors

04.18.2025: Tracing for MCP Client-Server Applications 🔌

Available in Phoenix 8.26+

We’re excited to announce a powerful capability in the OSS library openinference-instrumentation-mcp — seamless OTEL context propagation for MCP clients and servers.

What’s New?

This release introduces automatic distributed tracing for Anthropic’s Model Context Protocol (MCP). Using OpenTelemetry, you can now:

  • Propagate context across MCP client-server boundaries

  • Generate end-to-end traces of your AI system across services and languages

  • Gain full visibility into how models access and use external context

The openinference-instrumentation-mcp package handles this for you by:

  • Creating spans for MCP client operations

  • Injecting trace context into MCP requests

  • Extracting and continuing the trace context on the server

  • Associating the context with OTEL spans on the server side

Set up

  1. Instrument both MCP client and server with OpenTelemetry.

  2. Add the openinference-instrumentation-mcp package.

  3. Spans will propagate across services, appearing as a single connected trace in Phoenix.

Full example usage is available in the docs.


Acknowledgments

Big thanks to Adrian Cole and Anuraag Agrawal for their contributions to this feature.

04.15.2025: Display Tool Call and Result IDs in Span Details 🫆

Available in Phoenix 8.25+

Tool call and result IDs are now shown in the span details view. Each ID is placed within a collapsible header and can be easily copied. This update also supports spans with multiple tool calls. Get started with tracing your tool calls in the docs.

Improvements and Bug Fixes 🐛

  • Performance: Do not refetch tables when trace and span details closed

  • UI: Redirect /v1/traces to root path

  • Playground: Update GPT-4.1 models in Playground

04.03.2025: Phoenix Client Prompt Tagging 🏷️

Available in Phoenix 8.22+

We’ve added support for Prompt Tagging in the Phoenix client. This new feature gives you more control and visibility over your prompts throughout the development lifecycle.

  • Tag prompts directly in your code and see those tags reflected in the Phoenix UI.

  • Label prompt versions as development, staging, or production — or define your own custom tags.

  • Add tag descriptions to provide additional context or list out all tags.

Check out the documentation on prompt tags.

Improvements and Bug Fixes 🐛

  • Add aiohttp to container for azure-identity

03.24.2025: Tracing Configuration Tab 🖌️

Available in Phoenix 8.19+

Within each project, there is now a Config tab to enhance customization. The default tab can now be set per project, ensuring the preferred view is displayed.

Learn more in the docs.

Improvements and Bug Fixes 🐛

  • Use correlated subquery for orphan spans

  • Add toggle to treat orphan spans as root

  • Upgrade react-router, vite, vitest

  • Experiments: Added a delete experiment option to the action menu

  • Feature: Added support for specifying admin users via an environment variable at startup

  • Annotation: Now displays metadata

  • Settings Page: Now split across tabs for improved navigation and easier access

  • Feedback: Added full metadata

  • Projects: Improved performance

  • UI: Added date format descriptions to explanations

07.11.2024: Hosted Phoenix and LlamaTrace 💻

Phoenix is now available for deployment as a fully hosted service.

In addition to our existing notebook, CLI, and self-hosted deployment options, we’re excited to announce that Phoenix is now available as a fully hosted service.

With hosted instances, your data is stored between sessions, and you can easily share your work with team members.

We are partnering with LlamaIndex to power a new observability platform in LlamaCloud: LlamaTrace. LlamaTrace will automatically capture traces emitted from your LlamaIndex applications and store them in a persistent, cloud-accessible Phoenix instance.

Hosted Phoenix is 100% free to use.

01.18.2025: Automatic & Manual Span Tracing ⚙️

Available in Phoenix 7.9+

In addition to using our automatic instrumentors and tracing directly using OTEL, we've now added our own layer to let you have the granularity of manual instrumentation without as much boilerplate code.

You can now access a tracer object with streamlined options to trace functions and code blocks. The main two options are:

  • Using the decorator @tracer.chain traces the entire function automatically as a Span in Phoenix. The input, output, and status attributes are set based on the function's parameters and return value.

  • Using the tracer in a with clause allows you to trace specific code blocks within a function. You manually define the Span name, input, output, and status.

Check out the docs for more on how to use tracer objects.

07.02.2024: Function Call Evaluations ⚒️

Available in Phoenix 4.6+

We are introducing a new built-in function call evaluator that scores the function/tool-calling capabilities of your LLMs. This off-the-shelf evaluator will help you ensure that your models are not just generating text but also effectively interacting with tools and functions as intended.

This evaluator checks for issues arising from function routing, parameter extraction, and function generation.

Check out a full walkthrough of the evaluator in the docs.

Example usage for log_spans and uniquify_spans (from the 06.13.2025 span logging release above):

from phoenix.client import Client
from phoenix.client.helpers.spans import uniquify_spans

client = Client()

spans = [
    {
        "name": "llm_call",
        "context": {"trace_id": "trace_123", "span_id": "span_456"},
        "start_time": "2024-01-15T10:00:00Z",
        "end_time": "2024-01-15T10:00:05Z",
        "span_kind": "LLM"
    }
]

unique_spans = uniquify_spans(spans)
result = client.spans.log_spans(
    project_identifier="my-project",
    spans=unique_spans,
)

07.29.2025: Google GenAI Evals 🌐

We’ve added support for the GoogleGenAIModel in phoenix-evals, enabling direct access to Google's Gemini models through the official Google GenAI SDK. As of late 2024, this is the recommended approach for working with Gemini, offering a unified interface across both the Developer API and VertexAI.

🚀 Key Features

  • Multimodal Support: Run evaluations on text, image, and audio inputs using Gemini’s multimodal capabilities.

  • Async-Ready: Optimized for high-throughput evals with full async compatibility.

  • Flexible Authentication: Supports both API key and VertexAI-based authentication methods.

  • Dynamic Rate Limiting: Built-in rate limiter with automatic adjustment based on API feedback and usage patterns.

This integration makes it easier to run robust, scalable evaluations using Gemini models directly within your phoenix-evals workflows.

Huge shoutout to Siddharth Sahu for this contribution!

More information is available in our docs.

07.25.2025: Average Metrics in Experiment Comparison Table 📊

Available in Phoenix 11.12+

The experiment comparison table now displays average experiment run data in the table headers, making it easier to spot high-level differences across runs at a glance.

07.02.2025: Cursor MCP Button ⚡️

Available in Phoenix 11.3+

Cursor IDE Integration

You can now click “Add to Cursor” directly in the Phoenix README to get a continuously updating MCP server configuration integrated into your IDE. This makes it seamless to keep your Phoenix + MCP setup in sync while developing with Cursor.

New phoenix-support Tool for Agents

The phoenix-support tool from @arizeai/phoenix-mcp@2.2.0 allows Agents like Cursor, Claude, and Windsurf to:

  • Look up Phoenix and OpenInference documentation and best practices.

  • Use this information to make code changes automatically in your workspace.

  • For example: watch Cursor one-shot instrument a LlamaIndex app using Phoenix, without manual intervention.

05.05.2025: OpenInference Google GenAI Instrumentation 🧩

We’ve added a Python auto-instrumentation library for the Google GenAI SDK. This enables seamless tracing of GenAI workflows with full OpenTelemetry compatibility. Traces can be exported to any OpenTelemetry collector.

Installation

pip install openinference-instrumentation-google-genai

See the docs for details on setting up the tracing integration.

Additionally, the Google GenAI instrumentor is now supported and works seamlessly with Span Replay in Phoenix, enabling deep trace inspection and replay for more effective debugging and observability.

Acknowledgements

Big thanks to Harrison Chu for his contributions.

04.30.2025: Span Querying & Data Extraction for Phoenix Client 📊

Available in Phoenix 8.30+

The Phoenix client now includes the SpanQuery DSL, enabling more advanced and flexible span querying for distributed tracing and telemetry data. This allows users to perform complex queries on span data, improving trace analysis and debugging.

In addition, the get_spans_dataframe method has been migrated, offering an easy-to-use way to extract span-related information as a Pandas DataFrame. This simplifies data processing and visualization, making it easier to analyze trace data within Python-based environments.

Improvements and Bug Fixes 🐛

  • Projects: Add "Copy Name" button to project menu

  • TLS: Add independent flags for whether TLS is enabled for HTTP and gRPC servers

  • Playground: Log playground subscription errors

  • API: New RBAC primitives have been introduced for FastAPI and REST APIs

04.09.2025: Project Management API Enhancements ✨

Available in Phoenix 8.24+

This update enhances the Project Management API with more flexible project identification:

  • Enhanced project identification: Added support for identifying projects by both ID and hex-encoded name, and introduced a new _get_project_by_identifier helper function

Also includes streamlined operations, better validation & error handling, and expanded test coverage.

Improvements and Bug Fixes 🐛

  • Performance: Restore streaming

  • Playground: Update Gemini models

  • Enhancement: Route users to the forgot-password page in the welcome email URL

04.09.2025: New REST API for Projects with RBAC 📽️

Available in Phoenix 8.23+

This release introduces a REST API for managing projects, complete with full CRUD functionality and access control. Key features include:

  • CRUD Operations: Create, read, update, and delete projects via the new API endpoints.

  • Role-Based Access Control:

    • Admins can create, read, update, and delete projects

    • Members can create and read projects, but cannot modify or delete them.

  • Additional Safeguards: Immutable Project Names, Default Project Protection, Comprehensive Integration Tests

Check out our new documentation to test these features.
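
The CRUD routes could be scripted roughly as follows; the /v1/projects path here is an assumption based on Phoenix's v1 REST prefix, so check the REST docs for the exact routes:

```python
import json
from urllib.request import Request

BASE = "http://localhost:6006"  # assumed local Phoenix instance

# Sketch of the project CRUD routes; the /v1/projects path is an assumption.
def create_project(name: str, description: str = "") -> Request:
    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return Request(f"{BASE}/v1/projects", data=body,
                   headers={"Content-Type": "application/json"}, method="POST")

def get_project(project_identifier: str) -> Request:
    return Request(f"{BASE}/v1/projects/{project_identifier}", method="GET")

def delete_project(project_identifier: str) -> Request:
    # Admin-only, per the RBAC rules above.
    return Request(f"{BASE}/v1/projects/{project_identifier}", method="DELETE")

req = create_project("my-project")
# urllib.request.urlopen(req) would send the request to a running server.
```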

Improvements and Bug Fixes 🐛

  • Phoenix Server: add PHOENIX_ALLOWED_ORIGINS env

  • Tracing: Delete annotations in the feedback table, Make feedback table scrollable

  • Experiments: Allow scrolling the entire experiment compare table

  • Projects: Make time range selector more accessible

  • Playground: Don't close model settings dialog when picking Azure version

  • Session: improve PostgreSQL error message in launch_app

04.02.2025: Improved Span Annotation Editor ✍️

Available in Phoenix 8.21+

The new span aside moves the Span Annotation editor into a dedicated panel, providing a clearer view for adding annotations and enhancing customization of your setup. Read this documentation to learn how annotations can be used.

Improvements and Bug Fixes 🐛

  • Enhancement: Allow the option to have no configured working directory when using Postgres

  • Performance: Cache project table results when toggling the details slide-over for improved performance

  • UI: Add chat and message components for note-taking

04.01.2025: Support for MCP Span Tool Info in OpenAI Agents SDK 🔨

Available in Phoenix 8.20+

Newly added to the OpenAI Agent SDK is support for MCP Span Info, allowing for the tracing and extraction of useful information about MCP tool listings. Use the Phoenix OpenAI Agents SDK for powerful agent tracing.

03.27.2025: Span View Improvements 👀

Available in Phoenix 8.20+

You can now toggle the option to treat orphan spans as root when viewing your spans. Additionally, we've enhanced the UI with an icon view in span details for better visibility on smaller displays. Learn more in our docs.

Improvements and Bug Fixes 🐛

  • Performance: Disable streaming when a dialog is open

  • Playground: Removed unpredictable playground transformations

03.21.2025: Environment Variable Based Admin User Configuration 🗝️

Available in Phoenix 8.17+

You can now specify one or more admin users at startup using an environment variable. This is especially useful for managed deployments, allowing you to define admin access in a manifest or configuration file. The specified users will be automatically seeded into the database, enabling immediate login without manual setup.

Improvements and Bug Fixes 🐛

  • Performance: Smaller page sizes

  • Projects: Improved performance on projects page

  • Experiments: Allow hover anywhere on experiment cell

  • Annotations: Show metadata

  • Feedback: Show full metadata

03.20.2025: Delete Experiment from Action Menu 🗑️

Available in Phoenix 8.19+

You can now delete experiments directly from the action menu, making it quicker to manage and clean up your workspace. This update streamlines experiment management by reducing the steps needed to remove outdated or unnecessary runs. Get started with experiments in the docs.

Improvements and Bug Fixes 🐛

  • UI: Show the date format in the explanation

03.14.2025: OpenAI Agents Instrumentation 📡

Available in Phoenix 8.13+

We've introduced the OpenAI Agents SDK for Python which provides enhanced visibility into agent behavior and performance.

Installation

pip install openinference-instrumentation-openai-agents openai-agents

  • Includes an OpenTelemetry Instrumentor that traces agents, LLM calls, tool usage, and handoffs.

  • With minimal setup, use the register function to connect your app to Phoenix and view real-time traces of agent workflows.

For more details on a quick setup, check out our integration documentation.


Improvements and Bug Fixes 🐛

  • Prompt Playground: Azure API key made optional, included specialized UI for thinking budget parameter

  • Performance: Make the spans table the default tab

  • Components: Added react-aria Tabs components

  • Enhancement: Download experiment runs and annotations as CSV

07.18.2024: Guardrails AI Integrations💂

Available in Phoenix 4.11+

Our integration with Guardrails AI allows you to capture traces on guard usage and create datasets based on these traces. This integration is designed to enhance the safety and reliability of your LLM applications, ensuring they adhere to predefined rules and guidelines.

Check out the Cookbook here.

09.26.2024: Authentication & RBAC 🔐

Available in Phoenix 5.0+

We've added Authentication and Role-Based Access Control to Phoenix. This was a long-requested feature set, and we're excited for the new uses of Phoenix this will unlock!

The auth feature set includes:

  • Secure Access: All of Phoenix’s UI & APIs (REST, GraphQL, gRPC) now require access tokens or API keys. Keep your data safe!

  • RBAC (Role-Based Access Control): Admins can manage users; members can update their profiles—simple & secure.

  • API Keys: Now available for seamless, secure data ingestion & querying.

  • OAuth2 Support: Easily integrate with Google, AWS Cognito, or Auth0.

  • ✉️ Password Resets: Via SMTP to make security a breeze.

For all the details on authentication, view our docs.
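
With auth enabled, every API call needs an access token or API key. As a sketch (the bearer-token header usage is an assumption; see the auth docs for the exact scheme), attaching a key to a request could look like:

```python
import os
from urllib.request import Request

# Read the API key from the environment rather than hard-coding it.
API_KEY = os.environ.get("PHOENIX_API_KEY", "")

def authed_request(url: str) -> Request:
    # Attach the key as a bearer token; header scheme is an assumption here.
    return Request(url, headers={"Authorization": f"Bearer {API_KEY}"})

req = authed_request("http://localhost:6006/v1/projects")
```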

Bug Fixes and Improvements 🐛

  • Numerous stability improvements to our hosted Phoenix instances accessed on app.phoenix.arize.com

  • Added a new command to easily launch a Phoenix server from the CLI: phoenix serve

  • Implemented simple email sender to simplify dependencies

  • Improved error handling for imported spans

  • Replaced hdbscan with fast-hdbscan

  • Added PHOENIX_CSRF_TRUSTED_ORIGINS environment variable to set trusted origins

  • Added support for Mistral 1.0

  • Fixed an issue that caused px.Client().get_spans_dataframe() requests to time out

05.09.2025: Annotations, Data Retention Policies, Hotkeys 📓

Available in Phoenix 9.0.0+

Phoenix v9.0.0 release brings major updates to annotation support, and a whole host of other improvements.

Annotations 🏷️

Up until now, Phoenix has only supported one annotation of a given type on each trace. We've now unlocked that limit, allowing you to capture multiple values of an annotation label on each span.

In addition, we've added:

  • API support for annotations - create, query, and update annotations through the REST API

  • Additional support for code evaluations as annotations

  • Support for arbitrary metadata on annotations

  • Annotation configurations to structure your annotations within and across projects

Data Retention 💿

Now you can create custom global and per-project data retention policies to remove traces after a certain window of time, or based on the number of traces. Additionally, you can now view your disk usage in the Settings page of Phoenix.

Hotkeys 🔥

We've added hotkeys to Phoenix!

You can now use j and k to quickly page through your traces, and e and n to add annotations and notes - you never have to lift your hands off the keyboard again!

See the full v9.0.0 release for details.

04.28.2025: TLS Support for Phoenix Server 🔐

Available in Phoenix 8.29+

Phoenix now supports Transport Layer Security (TLS) for both HTTP and gRPC connections, enabling encrypted communication and optional mutual TLS (mTLS) authentication. This enhancement provides a more secure foundation for production deployments.

Highlights:

  • Secure HTTP & gRPC Connections: Phoenix can now serve over HTTPS and secure gRPC.

  • Flexible TLS Configuration: TLS settings are managed via environment variables.

  • Optional Client Verification: Support for mTLS with configurable client certificate validation.

  • Improved Testing: TLS-aware infrastructure added to integration tests.

  • Better Visibility: Server startup logs now display TLS status.

Configuration Options

TLS is enabled and customized through environment variables; see the configuration docs for each variable's name, type, and description.

Note: Encrypted private keys require the cryptography Python package for decryption.

03.07.2025: New Prompt Playground, Evals, and Integration Support 🦾

Available in Phoenix 8.9+

New update overview:

  • Prompt Playground: Now supports Anthropic Sonnet 3.7 and Thinking Budgets

  • Instrumentation: New instrumentation to trace smolagents by Hugging Face

  • Evals: o3 support, Audio & Multi-Modal Evaluations

  • Integrations: New integrations now supported in Phoenix

Improvements and Bug Fixes 🐛

  • Show percent used of DB

  • Add environment variable for allocated DB storage capacity

  • Delete selected traces

  • Make trace tree more readable on smaller sizes

  • Ensure type is correct on run_experiment

  • Allow experiment run JSON downloads

  • Add anthropic thinking config param

  • Add ToggleButton

03.07.2025: Model Config Enhancements for Prompts 💡

Available in Phoenix 8.11+

  • Save and Load from Prompts: You can now save and load configurations directly from prompts.

  • Save and Load from Default Model Config: Default model configurations can be saved and loaded.

  • Budget Token Management: Added the ability to adjust the budget token value.

  • Thinking Configuration Toggle: You can now enable or disable the “thinking” feature.

Important Note: The default model config does not automatically apply to saved prompts. To include default thinking settings, ensure they are saved within the specific prompt.

Improvements and Bug Fixes 🐛

  • Added annotations to experiment JSON downloads

  • Add none as option for tool choice for anthropic 0.49.0

  • Port slider component to react-aria

03.06.2025: Project Improvements 📽️

Available in Phoenix 8.5+

We’ve introduced several enhancements to Projects, providing greater flexibility and control over how you interact with data. These updates include:

  • Your selected columns will now remain consistent across sessions, ensuring a more seamless workflow.

  • Easily filter data directly from the table view using metadata attributes.

  • Custom Time Ranges: You can now specify custom time ranges to filter traces and spans.

  • Root Span Filter for Spans: Improved filtering options allow you to filter by root spans, helping to isolate and debug issues more effectively.

  • Quickly apply common metadata filters for faster navigation.

  • Major speed improvements in project tracing views and visibility into database usage in settings.

Improvements and Bug Fixes 🐛

  • GraphQL: Query to get number of spans for each trace

  • Performance: Show + n more spans in trace table

  • Components: Add Token component

  • Performance: Remove double fetching of spans

  • Performance: Don't fetch new traces when the traces slideover is visible

  • UI: Fix scrolling on trace tree

11.18.2024: Prompt Playground 🛝

Available in Phoenix 6.0+

Prompt Playground is now available in the Phoenix platform! This new release allows you to test the effects of different prompts, tools, and structured output formats to see which performs best.

Replay individual spans with modified prompts, or run full Datasets through your variations, and automatically capture traces as Experiment runs for later debugging.

Bug Fixes and Improvements 🐛

  • Added support for FastAPI and GraphQL extensions

  • Fixed a bug where Anthropic LLM as a Judge responses would be labeled as unparseable

  • Fixed a bug causing 500 errors on client.get_traces_dataset() and client.get_spans_dataframe()

  • Added the ability for authentication to work from behind a proxy

  • Added an environment variable to set default admin passwords in auth

07.03.2024: Datasets & Experiments 🧪

Available in Phoenix 4.6+

Datasets: Datasets are a new core feature in Phoenix that live alongside your projects. They can be imported, exported, created, curated, manipulated, and viewed within the platform, and should make a few flows much easier:

  • Fine-tuning: You can now create a dataset based on conditions in the UI, or by manually choosing examples, then export these into CSV or JSONL formats ready-made for fine-tuning APIs.

  • Experimentation: External datasets can be uploaded into Phoenix to serve as the test cases for experiments run in the platform.

For more details on using datasets see our documentation or example notebook.

Experiments: Our new Datasets and Experiments feature enables you to create and manage datasets for rigorous testing and evaluation of your models. You can now run comprehensive experiments to measure and analyze the performance of your LLMs in various scenarios.

For more details, check out our full walkthrough.

12.09.2024: Sessions 💬

Available in Phoenix 7.0+

Sessions allow you to group multiple responses into a single thread. Each response is still captured as a single trace, but each trace is linked together and presented in a combined view.

Sessions make it easier to visualize multi-turn exchanges with your chatbot or agent. Sessions launches with Python and TS/JS support. For more on sessions, check out a walkthrough video and the docs.

Bug Fixes and Improvements 🐛

  • Prompt Playground: Added support for arbitrary string model names, added support for Gemini 2.0 Flash, and improved template editor ergonomics

  • Evals: Added multimodal message template support

  • Tracing: Added JSON pretty printing for structured data outputs (thank you sraibagiwith100x!) and a breakdown of token types in the project summary

  • Bug Fixes: Changed trace latency to be computed every time rather than relying on root span latency, and added additional type checking to handle non-string values when manually instrumenting (thank you Manuel del Verme!)

TLS configuration environment variables:

  • PHOENIX_TLS_ENABLED (boolean): Enable or disable TLS (true/false)

  • PHOENIX_TLS_CERT_FILE (string): Path to TLS certificate file

  • PHOENIX_TLS_KEY_FILE (string): Path to private key file

  • PHOENIX_TLS_KEY_FILE_PASSWORD (string): Password for encrypted private key file

  • PHOENIX_TLS_CA_FILE (string): Path to CA certificate (for client verification)

  • PHOENIX_TLS_VERIFY_CLIENT (boolean): Enable client cert verification
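Taken together, a TLS-enabled deployment might export these variables as below. This is an illustrative sketch: the certificate paths are placeholders, and enabling PHOENIX_TLS_VERIFY_CLIENT assumes you want mutual TLS.

```shell
# Enable TLS for the Phoenix server (HTTP and gRPC listeners)
export PHOENIX_TLS_ENABLED=true
export PHOENIX_TLS_CERT_FILE=/etc/phoenix/tls/server.crt
export PHOENIX_TLS_KEY_FILE=/etc/phoenix/tls/server.key

# Optional: require client certificates (mutual TLS)
export PHOENIX_TLS_CA_FILE=/etc/phoenix/tls/ca.crt
export PHOENIX_TLS_VERIFY_CLIENT=true
```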


04.25.2025: Scroll Selected Span Into View 🖱️

Available in Phoenix 8.27+

Improved trace navigation by automatically scrolling the selected span into view when a user navigates to a specific trace. This enhancement eliminates the need for manual searching or scrolling, allowing users to immediately focus on the span of interest. It's especially useful when navigating from links or alerts that point to a specific span, improving debugging efficiency. This change contributes to a smoother and more intuitive trace exploration experience.

Improvements and Bug Fixes 🐛

  • Enhancement: Add /readyz endpoint to confirm database connectivity

  • Enhancement: Allow scroll on settings page

02.19.2025: Prompts 📃

Available in Phoenix 8.0+

Phoenix prompt management will now let you create, modify, tag, and version control prompts for your applications. Some key highlights from this release:

  • Versioning & Iteration: Seamlessly manage prompt versions in both Phoenix and your codebase.

  • New TypeScript Client: Sync prompts with your JavaScript runtime, now with native support for OpenAI, Anthropic, and the Vercel AI SDK.

  • New Python Client: Sync templates and apply them to AI SDKs like OpenAI, Anthropic, and more.

  • Standardized Prompt Handling: Native normalization for OpenAI, Anthropic, Azure OpenAI, and Google AI Studio.

  • Enhanced Metadata Propagation: Track prompt metadata on Playground spans and experiment metadata in dataset runs.

Check out the docs and this walkthrough for more on prompts. 📝

03.18.2025: Resize Span, Trace, and Session Tables 🔀

Available in Phoenix 8.14+

We've added the ability to resize Span, Trace, and Session tables. Resizing preferences are now persisted in the tracing store, ensuring settings are maintained per-project and per-table.

Improvements and Bug Fixes 🐛

  • UI: Remove shadow on button group

  • UI: Fixed broken popovers

06.03.2025: Deploy via Helm ☸️

Available in Phoenix 10.6+

We’re excited to announce that Phoenix can now be deployed via a Helm chart for Kubernetes.

This allows you to:

  • Quickly spin up Phoenix with a single helm install and a single YAML file.

  • Launch using the infrastructure and deployment patterns recommended by the Phoenix team, ensuring consistency and ease of maintenance.

  • Easily upgrade to the latest Phoenix features and improvements over time.

Whether you are self-hosting in a cloud Kubernetes cluster or on-premises, the new Helm chart makes deploying Phoenix simpler and more reliable than ever.

Set up Instructions

05.30.2025: xAI and Deepseek Support in Playground 🛝

Available in Phoenix 10.5+

Phoenix v10.5.0 now supports Deepseek and xAI models in Playground natively. Previous versions of Phoenix supported these as custom model endpoints, but that process has now been streamlined to offer these model providers from the main Playground dropdown.


Release Notes

The latest from the Phoenix team.

07.29.2025: Google GenAI Evals 🌐

New in phoenix-evals: Added support for Google's Gemini models via the Google GenAI SDK — multimodal, async, and ready to scale. Huge shoutout to Siddharth Sahu for this contribution!


07.25.2025: Project Dashboards 📈

Available in Phoenix 11.12+

Phoenix now has comprehensive project dashboards for detailed performance, cost, and error insights.


07.25.2025: Average Metrics in Experiment Comparison Table 📊

Available in Phoenix 11.12+

View average run metrics directly in the headers of the experiment comparison table for quick insights.


07.21.2025: Project and Trace Management via GraphQL 📤

Available in Phoenix 11.9+

Create new projects and transfer traces between them via GraphQL, with full preservation of annotations and cost data.


07.18.2025: OpenInference Java ✨

OpenInference Java now offers full OpenTelemetry-compatible tracing for AI apps, including auto-instrumentation for LangChain4j and semantic conventions.


07.13.2025: Experiments Module in phoenix-client 🧪

Available in Phoenix 11.7+

New experiments feature set in phoenix-client, enabling sync and async execution with task runs, evaluations, rate limiting, and progress reporting.


07.09.2025: Baseline for Experiment Comparisons 🔁

Available in Phoenix 11.6+

Compare experiments relative to a baseline run to easily spot regressions and improvements across metrics.


07.07.2025: Database Disk Usage Monitor 🛑

Available in Phoenix 11.5+

Monitor database disk usage, notify admins when nearing capacity, and automatically block writes when critical thresholds are reached.


07.03.2025: Cost Summaries in Trace Headers 💸

Available in Phoenix 11.4+

Added cost summaries to trace headers, showing total and segmented (prompt & completion) costs at a glance while debugging.


07.02.2025: Cursor MCP Button ⚡️

Available in Phoenix 11.3+

Phoenix README now has a “Add to Cursor” button for seamless IDE integration with Cursor. @arizeai/phoenix-mcp@2.2.0 also includes a new tool called phoenix-support, letting agents like Cursor auto-instrument your apps using Phoenix and OpenInference best practices.


06.25.2025: Cost Tracking 💰

Available in Phoenix 11.0+

Phoenix now automatically tracks token-based LLM costs using model pricing and token counts, rolling them up to trace and project levels for clear, actionable cost insights.


06.25.2025: New Phoenix Cloud ☁️

Phoenix now supports multiple customizable spaces with individual user access and collaboration, enabling teams to work together seamlessly.


06.25.2025: Amazon Bedrock Support in Playground 🛝

Available in Phoenix 10.15+

Phoenix’s Playground now supports Amazon Bedrock, letting you run, compare, and track Bedrock models alongside others—all in one place.


06.13.2025: Session Filtering 🪄

Available in Phoenix 10.12+

Now you can filter sessions by their unique session_id across the API and UI, making it easier to pinpoint and inspect specific sessions.


06.13.2025: Enhanced Span Creation and Logging 🪐

Available in Phoenix 10.12+

Now you can create spans directly via a new POST API and client methods, with helpers to safely regenerate IDs and prevent conflicts on insertion.


06.12.2025: Dataset Filtering 🔍

Available in Phoenix 10.11+

Dataset name filtering with live search support across the API and UI.


06.06.2025: Experiment Progress Graph 📊

Available in Phoenix 10.9+

Phoenix now has experiment graphs to track how your evaluation scores and latency evolve over time.


06.04.2025: Ollama Support in Playground 🛝

Ollama is now supported in the Playground, letting you experiment with its models and customize parameters for tailored prompting.


06.03.2025: Deploy Phoenix via Helm ☸️

Available in Phoenix 10.6+

Added Helm chart support for Phoenix, making Kubernetes deployment fast, consistent, and easy to upgrade.


05.30.2025: xAI and Deepseek Support in Playground 🛝

Available in Phoenix 10.7+

Deepseek and xAI models are now available in Prompt Playground!


05.20.2025: Datasets and Experiment Evaluations in the JS Client 🧪

We've added a host of new methods to the JS client:

  • getExperiment - allows you to retrieve an Experiment to view its results, and run evaluations on it

  • evaluateExperiment - allows you to evaluate previously run Experiments using LLM as a Judge or Code-based evaluators

  • createDataset - allows you to create Datasets in Phoenix using the client

  • appendDatasetExamples - allows you to append additional examples to a Dataset


05.14.2025: Experiments in the JS Client 🔬

You can now run Experiments using the Phoenix JS client! Use Experiments to test different iterations of your applications over a set of test cases, then evaluate the results. This release includes:

  • Native tracing of tasks and evaluators

  • Async concurrency queues

  • Support for any evaluator (including bring your own evals)


05.09.2025: Annotations, Data Retention Policies, Hotkeys 📓

Available in Phoenix 9.0+

Major Release: Phoenix v9.0.0

Phoenix's v9.0.0 release brings with it:

  • A host of improvements to Annotations, including one-to-many support, API access, annotation configs, and custom metadata

  • Customizable data retention policies

  • Hotkeys! 🔥


05.05.2025: OpenInference Google GenAI Instrumentation 🧩

We’ve added a Python auto-instrumentation library for the Google GenAI SDK. This enables seamless tracing of GenAI workflows with full OpenTelemetry compatibility. Additionally, the Google GenAI instrumentor is now supported and works seamlessly with Span Replay in Phoenix.


04.30.2025: Span Querying & Data Extraction for PX Client 📊

Available in Phoenix 8.30+

The Phoenix client now includes the SpanQuery DSL for more advanced span querying. Additionally, a get_spans_dataframe method has been added to facilitate easier data extraction for span-related information.


04.28.2025: TLS Support for Phoenix Server 🔐

Available in Phoenix 8.29+

Phoenix now supports Transport Layer Security (TLS) for both HTTP and gRPC connections, enabling encrypted communication and optional mutual TLS (mTLS) authentication. This enhancement provides a more secure foundation for production deployments.


04.28.2025: Improved Shutdown Handling 🛑

Available in Phoenix 8.28+

When stopping the Phoenix server via Ctrl+C, the shutdown process now exits cleanly with code 0 to reflect intentional termination. Previously, this would trigger a traceback with KeyboardInterrupt, misleadingly indicating a failure.


04.25.2025: Scroll Selected Span Into View 🖱️

Available in Phoenix 8.27+

Improved trace navigation by automatically scrolling the selected span into view when a user navigates to a specific trace. This enhances usability by making it easier to locate and focus on the relevant span without manual scrolling.


04.18.2025: Tracing for MCP Client-Server Applications 🔌

Available in Phoenix 8.26+

We’ve released openinference-instrumentation-mcp, a new package in the OpenInference OSS library that enables seamless OpenTelemetry context propagation across MCP clients and servers. It automatically creates spans, injects and extracts context, and connects the full trace across services to give you complete visibility into your MCP-based AI systems.

Big thanks to Adrian Cole and Anuraag Agrawal for their contributions to this feature.


04.16.2025: API Key Generation via API 🔐

Available in Phoenix 8.26+

Phoenix now supports programmatic API key creation through a new endpoint, making it easier to automate project setup and trace logging. To enable this, set the PHOENIX_ADMIN_SECRET environment variable in your deployment.


04.15.2025: Display Tool Call and Result IDs in Span Details 🫆

Available in Phoenix 8.25+

Tool call and result IDs are now shown in the span details view. Each ID is placed within a collapsible header and can be easily copied. This update also supports spans with multiple tool calls. Get started with tracing your tool calls here.


04.09.2025: Project Management API Enhancements ✨

Available in Phoenix 8.24+

This update enhances the Project Management API with more flexible project identification. We've added support for identifying projects by both ID and hex-encoded name, and introduced a new _get_project_by_identifier helper function.


04.09.2025: New REST API for Projects with RBAC 📽️

Available in Phoenix 8.23+

This release introduces a REST API for managing projects, complete with full CRUD functionality and access control. Key features include CRUD Operations and Role-Based Access Control. Check out our new documentation to test these features.


04.03.2025: Phoenix Client Prompt Tagging 🏷️

Available in Phoenix 8.22+

We’ve added support for Prompt Tagging in the Phoenix client. This new feature gives you more control and visibility over your prompts throughout the development lifecycle. Tag prompts directly in code, label prompt versions, and add tag descriptions. Check out documentation on prompt tags.


04.02.2025 Improved Span Annotation Editor ✍️

Available in Phoenix 8.21+

The new span aside moves the Span Annotation editor into a dedicated panel, providing a clearer view for adding annotations and enhancing customization of your setup. Read this documentation to learn how annotations can be used.


04.01.2025: Support for MCP Span Tool Info in OpenAI Agents SDK 🔨

Available in Phoenix 8.20+

Newly added to the OpenAI Agent SDK is support for MCP Span Info, allowing for the tracing and extraction of useful information about MCP tool listings. Use the Phoenix OpenAI Agents SDK for powerful agent tracing.


03.27.2025 Span View Improvements 👀

Available in Phoenix 8.20+

You can now toggle the option to treat orphan spans as root when viewing your spans. Additionally, we've enhanced the UI with an icon view in span details for better visibility in smaller displays. Learn more here.


03.24.2025: Tracing Configuration Tab 🖌️

Available in Phoenix 8.19+

Within each project, there is now a Config tab to enhance customization. The default tab can now be set per project, ensuring the preferred view is displayed. Learn more in the tracing documentation.


03.21.2025: Environmental Variable Based Admin User Configuration 🗝️

Available in Phoenix 8.17+

You can now preconfigure admin users at startup using an environment variable, making it easier to manage access during deployment. Admins defined this way are automatically seeded into the database and ready to log in.


03.20.2025: Delete Experiment from Action Menu 🗑️

Available in Phoenix 8.16+

You can now delete experiments directly from the action menu, making it quicker to manage and clean up your workspace.


03.19.2025: Access to New Integrations in Projects 🔌

Available in Phoenix 8.15+

In the New Project tab, we've added quick setup to instrument your application for BeeAI, SmolAgents, and the OpenAI Agents SDK. Easily configure these integrations with streamlined instructions. Check out all Phoenix tracing integrations here.


03.18.2025: Resize Span, Trace, and Session Tables 🔀

Available in Phoenix 8.14+

We've added the ability to resize Span, Trace, and Session tables. Resizing preferences are now persisted in the tracing store, ensuring settings are maintained per-project and per-table.


03.14.2025: OpenAI Agents Instrumentation 📡

Available in Phoenix 8.13+

We've introduced instrumentation for the OpenAI Agents SDK for Python, which provides enhanced visibility into agent behavior and performance. For more details on a quick setup, check out our docs.

pip install openinference-instrumentation-openai-agents openai-agents

03.07.2025: Model Config Enhancements for Prompts 💡

Available in Phoenix 8.11+

You can now save and load configurations directly from prompts or default model settings. Additionally, you can adjust the budget token value and enable/disable the "thinking" feature, giving you more control over model behavior and resource allocation.


03.07.2025: New Prompt Playground, Evals, and Integration Support 🦾

Available in Phoenix 8.9+

Prompt Playground now supports new GPT and Anthropic models with enhanced configuration options. Instrumentation options have been improved for better traceability, and evaluation capabilities have expanded to cover Audio & Multi-Modal Evaluations. Phoenix also introduces new integration support for LiteLLM Proxy & Cleanlabs evals.


03.06.2025: Project Improvements 📽️

Available in Phoenix 8.8+

We’ve rolled out several enhancements to Projects, offering more flexibility and control over your data. Key updates include persistent column selection, advanced filtering options for metadata and spans, custom time ranges, and improved performance for tracing views. These changes streamline workflows, making data navigation and debugging more efficient.

Check out the projects docs for more.


02.19.2025: Prompts 📃

Available in Phoenix 8.0+

Phoenix prompt management will now let you create, modify, tag, and version control prompts for your applications. Some key highlights from this release:

  • Versioning & Iteration: Seamlessly manage prompt versions in both Phoenix and your codebase.

  • New TypeScript Client: Sync prompts with your JavaScript runtime, now with native support for OpenAI, Anthropic, and the Vercel AI SDK.

  • New Python Client: Sync templates and apply them to AI SDKs like OpenAI, Anthropic, and more.

  • Standardized Prompt Handling: Native normalization for OpenAI, Anthropic, Azure OpenAI, and Google AI Studio.

  • Enhanced Metadata Propagation: Track prompt metadata on Playground spans and experiment metadata in dataset runs.

Check out the docs and this walkthrough for more on prompts!📝


02.18.2025: One-Line Instrumentation⚡️

Available in Phoenix 8.0+

Phoenix has made it even simpler to get started with tracing by introducing one-line auto-instrumentation. By using register(auto_instrument=True), you can enable automatic instrumentation in your application, which will set up instrumentors based on your installed packages.

from phoenix.otel import register

register(auto_instrument=True)

01.18.2025: Automatic & Manual Span Tracing ⚙️

Available in Phoenix 7.9+

In addition to using our automatic instrumentors and tracing directly using OTEL, we've now added our own layer to let you have the granularity of manual instrumentation without as much boilerplate code.

You can now access a tracer object with streamlined options to trace functions and code blocks. The main two options are using the decorator @tracer.chain and using the tracer in a with clause.

Check out the docs for more on how to use tracer objects.


12.09.2024: Sessions 💬

Available in Phoenix 7.0+

Sessions allow you to group multiple responses into a single thread. Each response is still captured as a single trace, but each trace is linked together and presented in a combined view.

Sessions make it easier to visualize multi-turn exchanges with your chatbot or agent. Sessions launches with Python and TS/JS support. For more on sessions, check out a walkthrough video and the docs.


11.18.2024: Prompt Playground 🛝

Available in Phoenix 6.0+

Prompt Playground is now available in the Phoenix platform! This new release allows you to test the effects of different prompts, tools, and structured output formats to see which performs best.

  • Replay individual spans with modified prompts, or run full Datasets through your variations.

  • Easily test different models, prompts, tools, and output formats side-by-side, directly in the platform.

  • Automatically capture traces as Experiment runs for later debugging. See here for more information on Prompt Playground, or jump into the platform to try it out for yourself.


09.26.2024: Authentication & RBAC 🔐

Available in Phoenix 5.0+

We've added Authentication and Role-Based Access Control (RBAC) to Phoenix. This was a long-requested feature set, and we're excited for the new uses of Phoenix this will unlock!

The auth feature set includes secure access, RBAC, API keys, and OAuth2 Support. For all the details on authentication, view our docs.


07.18.2024: Guardrails AI Integrations💂

Available in Phoenix 4.11.0+

Our integration with Guardrails AI allows you to capture traces on guard usage and create datasets based on these traces. This integration is designed to enhance the safety and reliability of your LLM applications, ensuring they adhere to predefined rules and guidelines.

Check out the Cookbook here.


07.11.2024: Hosted Phoenix and LlamaTrace 💻

Phoenix is now available for deployment as a fully hosted service.

In addition to our existing notebook, CLI, and self-hosted deployment options, we’re excited to announce that Phoenix is now available as a fully hosted service. With hosted instances, your data is stored between sessions, and you can easily share your work with team members.

We are partnering with LlamaIndex to power a new observability platform in LlamaCloud: LlamaTrace. LlamaTrace will automatically capture traces emitted from your LlamaIndex application.

Hosted Phoenix is 100% free to use. Check it out today!


07.03.2024: Datasets & Experiments 🧪

Available in Phoenix 4.6+

Datasets: Datasets are a new core feature in Phoenix that live alongside your projects. They can be imported, exported, created, curated, manipulated, and viewed within the platform, and make fine-tuning and experimentation easier.

For more details on using datasets see our documentation or example notebook.

Experiments: Our new Datasets and Experiments feature enables you to create and manage datasets for rigorous testing and evaluation of your models. Check out our full walkthrough.


07.02.2024: Function Call Evaluations ⚒️

Available in Phoenix 4.6+

We are introducing a new built-in function call evaluator that scores the function/tool-calling capabilities of your LLMs. This off-the-shelf evaluator will help you ensure that your models are not just generating text but also effectively interacting with tools and functions as intended. Check out a full walkthrough of the evaluator.


05.20.2025: Datasets and Experiment Evaluations in the JS Client 🧪

We've added a host of new methods to the JS client:

  • getExperiment - allows you to retrieve an Experiment to view its results, and run evaluations on it

  • evaluateExperiment - allows you to evaluate previously run Experiments using LLM as a Judge or Code-based evaluators

  • createDataset - allows you to create Datasets in Phoenix using the client

  • appendDatasetExamples - allows you to append additional examples to a Dataset

Full list of supported JS/TS Client Methods:


05.14.2025: Experiments in the JS Client 🔬

You can now run Experiments using the Phoenix JS client! Use Experiments to test different iterations of your applications over a set of test cases, then evaluate the results.

This release includes:

  • Native tracing of tasks and evaluators

  • Async concurrency queues

  • Support for any evaluator (including bring your own evals)

Code Implementation

import { createClient } from "@arizeai/phoenix-client";
import {
  asEvaluator,
  runExperiment,
} from "@arizeai/phoenix-client/experiments";
import type { Example } from "@arizeai/phoenix-client/types/datasets";
import { Factuality } from "autoevals";
import OpenAI from "openai";

const phoenix = createClient();
const openai = new OpenAI();

/** Your AI Task */
const task = async (example: Example) => {
  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: JSON.stringify(example.input, null, 2) },
    ],
  });
  return response.choices[0]?.message?.content ?? "No response";
};

await runExperiment({
  dataset: "dataset_id",
  experimentName: "experiment_name",
  client: phoenix,
  task,
  evaluators: [
    asEvaluator({
      name: "Factuality",
      kind: "LLM",
      evaluate: async (params) => {
        const result = await Factuality({
          output: JSON.stringify(params.output, null, 2),
          input: JSON.stringify(params.input, null, 2),
          expected: JSON.stringify(params.expected, null, 2),
        });
        return {
          score: result.score,
          label: result.name,
          explanation: (result.metadata?.rationale as string) ?? "",
          metadata: result.metadata ?? {},
        };
      },
    }),
  ],
});