Function Calling Agent: Evaluation

This demo shows how to run custom LLM evaluations in Phoenix, using an LLM-as-a-judge approach to evaluate a function calling agent. It walks through setting up the dataframes, using evaluation methods such as llm_generate and llm_classify, and exporting the results back to the Phoenix UI.
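To make the classify-style judge concrete, here is a minimal sketch in plain Python. It does not use Phoenix's API: the template wording, the `classify` and `snap_to_rails` helpers, the row fields, and the `stub_judge` function are all hypothetical stand-ins, and a real run would replace the stub with an actual LLM call and hold the rows in a dataframe.

```python
from typing import Callable, Dict, List

# Hypothetical prompt template; the wording is illustrative, not taken from the demo.
TEMPLATE = (
    "You are evaluating a function calling agent.\n"
    "Question: {question}\n"
    "Tool call: {tool_call}\n"
    "Answer with exactly one word: correct or incorrect."
)

# The only labels the judge is allowed to return.
RAILS = ["correct", "incorrect"]


def snap_to_rails(raw: str, rails: List[str]) -> str:
    """Map free-form judge output onto an allowed label, else 'unparseable'."""
    text = raw.strip().lower()
    # Check longer rails first so 'incorrect' is not mistaken for 'correct'.
    for rail in sorted(rails, key=len, reverse=True):
        if rail in text:
            return rail
    return "unparseable"


def classify(
    rows: List[Dict[str, str]],
    judge: Callable[[str], str],
    template: str = TEMPLATE,
    rails: List[str] = RAILS,
) -> List[Dict[str, str]]:
    """Fill the template for each row, ask the judge, and attach a label."""
    results = []
    for row in rows:
        prompt = template.format(**row)
        label = snap_to_rails(judge(prompt), rails)
        results.append({**row, "label": label})
    return results


def stub_judge(prompt: str) -> str:
    """Stand-in for a real LLM call: approves only calls to get_weather."""
    if "get_weather" in prompt:
        return "Correct."
    return "Incorrect, wrong tool."


# Hypothetical agent traces: one question and the tool call the agent made.
ROWS = [
    {"question": "What's the weather in Paris?",
     "tool_call": "get_weather(city='Paris')"},
    {"question": "What's the weather in Rome?",
     "tool_call": "search_flights(dest='Rome')"},
]

evaluated = classify(ROWS, stub_judge)
for item in evaluated:
    print(item["tool_call"], "->", item["label"])
```

Restricting the output to a fixed set of labels is what keeps the judge's answers easy to aggregate; anything that does not match a label is flagged rather than silently counted.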

Follow along in the notebook and learn more about how to build a custom evaluator.

Videos

Function Calling Agent: Evaluation

LLM Evaluation: Everything You Need To Run, Benchmark LLM Evals

What is LLM Observability?

The Definitive LLM Observability Checklist
