Changelog

See the latest features released in Arize.

Realtime Trace Ingestion for All Arize AX Instances

May 20, 2025

Realtime trace ingestion is now supported across all Arize AX tiers, including the free tier.

Previously, this feature was only available for enterprise AX users and within our open-source platform, Phoenix. It is now fully rolled out to all users of Arize AX.

No configuration changes are required to begin using realtime trace ingestion.

More OpenAI models in prompt playground and tasks

May 11, 2025

We've added support for more OpenAI models in prompt playground and evaluation tasks. Experiment across models and frameworks quickly.

Sleeker display of inputs and outputs on a span

May 9, 2025

We've improved the design of the span page to showcase functions, inputs, and outputs, helping you debug your traces faster!

Attribute search on traces

May 7, 2025

Now you can filter your span attributes right on the page, no more CMD+F!

Column selection in prompt playground

May 5, 2025

You can now view all of your prompt variables and dataset values directly in playground!

Latency and token counts in prompt playground

May 2, 2025

We've added latency and token counts to prompt playground runs! Currently supported for OpenAI, with more providers to come!

Major design refresh in Arize AX

April 28, 2025

We've refreshed Arize AX with polished fonts, spacing, color, and iconography throughout the whole platform.

Custom code evaluators

April 26, 2025

You can now run your own custom Python code evaluators in Arize against your data in a secure environment. Use background tasks to run any custom code, such as URL validation or keyword matching. Learn more
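
As a rough sketch, a custom code evaluator is just a Python function that inspects a span's output and returns a label and score. The function below (keyword matching) is a hypothetical example of such logic, not the exact interface Arize expects.

import re

def contains_keyword(output: str, keyword: str = "refund") -> dict:
    # Hypothetical keyword-match evaluator: label the output based on
    # whether it mentions the given keyword anywhere.
    matched = bool(re.search(rf"\b{re.escape(keyword)}\b", output, re.IGNORECASE))
    return {
        "label": "contains_keyword" if matched else "missing_keyword",
        "score": 1.0 if matched else 0.0,
    }

# Example: evaluate a single span output
print(contains_keyword("We have processed your refund request."))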

Security audit logs for enterprise customers

April 25, 2025

Improve your compliance and policy adherence. You can now use audit logs to monitor data access in Arize. Note: This feature is completely opt-in and this tracking is not enabled unless a customer explicitly asks for it. Learn more

Larger dataset runs in prompt playground

April 24, 2025

We've increased the row limit for datasets in the playground, so you can run prompts in parallel on up to 100 examples.

Evaluations on experiments

April 24, 2025

You can now create and run evals on your experiments from the UI. Compare performance across different prompt templates, models, or configurations without code. Learn more →

Cancel running background tasks

April 24, 2025

When running evaluations using background tasks, you can now cancel them mid-flight while observing task logs. Learn more →

Improved UI for functions in prompt playground

April 21, 2025

We've made it easier to view, test, and validate your tool calls in prompt playground. Learn more →

Compare prompts side by side

April 15, 2025

Compare the outputs of a new prompt and the original prompt side-by-side. Tweak model parameters and compare results across your datasets. Learn more →

Image segmentation support for CV models

April 14, 2025

We now support logging image segmentation to Arize. Log your segmentation coordinates and compare your predictions vs. your actuals.

Learn more →

New time selector on your traces

April 11, 2025

We’ve made it way easier to drill into specific time ranges, with quick presets like "last 15 minutes" and custom shorthand for specific dates and times, such as 10d, 4/1 - 4/6, or 4/1 3:00am. Learn more →

Prompt hub python SDK

April 7, 2025

Access and manage your prompts in code with support for OpenAI and VertexAI. Learn more

pip install "arize[PromptHub]"

View task run history and errors

April 4, 2025

Get full visibility into your evaluation task runs, including when they ran, what triggered them, and whether any errors occurred. Learn more →

Run evals and tasks over a date range

April 2, 2025

Easily run your online evaluation tasks over historical data.

Test online evaluation tasks in playground

March 24, 2025

Quickly debug and refine the prompts used by your online evaluators by loading them prefilled into prompt playground. Learn more →

Select metadata on the sessions page

March 1, 2025

Dynamically select the fields you want to see in your sessions view.

Labeling queues

February 27, 2025

Use Arize to annotate your data with 3rd parties. Learn more →

Expand and collapse your traces

February 20, 2025

You can now collapse rows to see more data at a glance or expand them to view more text.

Schedule your monitors

February 14, 2025

Schedule monitors to run hourly, daily, weekly, or monthly.

Improved traces export

February 14, 2025

Specify which columns of data you'd like to export via the ArizeExportClient using the columns parameter.

from datetime import datetime

from arize.exporter import ArizeExportClient
from arize.utils.types import Environments

client = ArizeExportClient()

primary_df = client.export_model_to_df(
    columns=['context.span_id', 'attributes.llm.input'],  # <---- HERE
    space_id='',
    model_id='',
    environment=Environments.TRACING,
    start_time=datetime(2025, 3, 25),
    end_time=datetime(2025, 4, 25),
)

Create dataset from CSVs

February 14, 2025

You can now create datasets in many ways: from traces, from code, manually in the UI, or via CSV upload. Read more

OTEL tracing via HTTP

February 14, 2025

Support for HTTP transport when sending traces to Arize! See GitHub for more info.

from arize.otel import register, Transport

tracer_provider = register(
    endpoint="https://otlp.arize.com/v1/traces",     # NEW
    transport=Transport.HTTP,                        # NEW
    space_id=SPACE_ID,
    api_key=API_KEY,
    project_name="test-project-http",
)

Voice application tracing and evaluation

January 21, 2025

Audio tracing: Capture, process, and send audio data to Arize and observe your application behavior.

Evaluation: Assess how well your models identify emotional tones like frustration, joy, or neutrality.

Voice App Tracing

Dashboard colors

January 21, 2025

We’ve added new ways to plot your charts, with custom colors and better UX!

Prompt hub

December 19, 2024

Manage, iterate, and deploy your prompts in one place. Version control your templates and use them across playground, tasks, and APIs. Read more

Managed code evaluators

December 19, 2024

Use our pre-built, off-the-shelf evaluators to evaluate spans without requiring requests to an LLM-as-a-Judge. These include Regex matching, JSON validation, Contains keyword, and more!

Create experiments from playground

December 19, 2024

Quickly experiment with your prompts across your datasets. All you have to do is click "Save as experiment". Read more

Monitor alert status

December 19, 2024

See exactly how and when your monitors are triggered.

LangChain Instrumentation

December 19, 2024

Support for sessions via LangChain native thread tracking in TypeScript is now available. Easily track multi-turn conversations / threads using LangChain.js.

Analyze your spans with Copilot

December 05, 2024

Extract key insights quickly from your spans instead of trying to decipher meaning in hundreds of spans. Ask questions and run evals right in the trace view.

Span Chat Evaluation

Generate dashboards with Copilot

December 05, 2024

Building dashboard plots just got way easier. Create time series plots and even translate code into ready-to-go visualizations.

Dashboard generator

The Custom Metric skill now supports a conversational flow, making it easier for users to iterate and refine metrics dynamically.

View your experiment traces

December 05, 2024

Experiment traces for a dataset are now consolidated and accessible under "Experiment Projects".

Experiment Projects

Multi-class calibration chart

December 05, 2024

For your multi-class ML models, you can see how your model is calibrated in one visualization.

Calibration Chart

Log experiments in Python SDK

December 05, 2024

You can now log experiment data manually using a dataframe, instead of running an experiment. This is useful if you already have the data you need, and re-running the query would be expensive. SDK Reference

arize_client.log_experiment(
    space_id=SPACE_ID,
    experiment_name="my_experiment",
    experiment_df=experiment_run_df,
    task_columns=task_columns,
    evaluator_columns={"correctness": evaluator_columns},
    dataset_name=dataset_name,
)

Create custom metrics with Copilot

November 07, 2024

Users can generate their desired metric by having Copilot translate natural language descriptions or existing code (e.g., SQL, Python) into AQL. Learn more →

Copilot Custom Metric Skill

Summarize embeddings with Copilot

November 07, 2024

Copilot now works for embeddings! Users can select embedding data points and Copilot will analyze them for patterns and insights. Learn more →

Copilot Embedding Summarization Skill

Local explainability support for ML models

November 07, 2024

Local Explainability is now live, providing both a table view and a waterfall-style plot for detailed, per-feature SHAP values on individual predictions. Learn more →

Local Explainability Support

See experiment results over time

November 07, 2024

Visualize specific evaluations over time in dashboards. Learn more →

Experiment Over Time Widget

Function calling replay in prompt playground

November 07, 2024

Now users can follow the full function calling tutorial from OpenAI and iterate on different functions in different messages from within the Prompt Playground.

Full Function Calling Replay

Vercel AI auto-instrumentation

November 07, 2024

Users can now ingest traces created by the Vercel AI SDK into Arize. Learn more →

Track sessions and context attributes in instrumentation

November 07, 2024

You can add metadata and context that will be picked up by all of our auto instrumentations and added to spans. Learn more →
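
As a minimal sketch, assuming the using_attributes context manager from the openinference-instrumentation package (my_llm_app below is a hypothetical stand-in for your own instrumented application call):

from openinference.instrumentation import using_attributes

# Spans created by any auto-instrumentor inside this block pick up the
# session ID, user ID, and metadata as span attributes.
with using_attributes(
    session_id="session-1234",
    user_id="user-5678",
    metadata={"environment": "staging"},
):
    response = my_llm_app("What is my billing status?")  # hypothetical app call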

Easily test your online tasks and evals

October 24, 2024

Users now have the option to test a task, such as an online eval, by running it once on existing data, or to apply evaluation labels to older traces. Learn more →

Experiment filters

October 24, 2024

Users can now filter experiments based on dataset attributes or experiment results, making it easy to identify areas for improvement and track their experiment progress with more precision. Learn more →

Filtering experiments by experiment name

Embedding traces

October 03, 2024

With Embeddings Tracing, you can effortlessly select embedding spans and dive straight into the UMAP visualizer, simplifying troubleshooting for your genAI applications. Learn more →

Embedding traces in action

Experiments Details Visualization

October 03, 2024

Users can now view a detailed breakdown of labels for their experiments on the Experiments Details page.

Experiments details visualization

Support for o1-mini and o1-preview in playground

October 03, 2024

We've added full support for all available OpenAI models in the playground, including o1-mini and o1-preview.

Improved auto-complete in playground

October 03, 2024

We've added better input variable behavior, autocompletion enhancements, support for mustache/f-string input variables, and more.

Filter history

October 03, 2024

We now store the last three filters used by a user! Users can easily access their filter history in the query filters dropdown, making it simpler to reuse filters for future queries.

Filters history

Tracing quick filters

October 03, 2024

Apply filters directly from the table by hovering over the text to reveal the filter icon.

Quick filters

New arize-otel package

October 03, 2024

We made it way simpler to add automatic tracing to your applications! It's now just a few lines of code to use OpenTelemetry to trace your LLM application. Check out our new quickstart guide which uses our arize-otel package.
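
As a rough sketch mirroring the register() call shown in the HTTP tracing entry above (the space ID, API key, and project name are placeholders, and the OpenAI instrumentor is just one example of an auto-instrumentor you might attach):

from arize.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

SPACE_ID = "your-space-id"   # placeholder
API_KEY = "your-api-key"     # placeholder

# One call sets up an OpenTelemetry tracer provider pointed at Arize.
tracer_provider = register(
    space_id=SPACE_ID,
    api_key=API_KEY,
    project_name="my-llm-app",
)

# Attach an auto-instrumentor so LLM calls are traced automatically.
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)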

Easily add spans to datasets

October 03, 2024

Easily add spans to a dataset from the Traces page using the "Add to Dataset" button.

"Add to Dataset" & "Setup Task" buttons
