Agent Cookbooks
Tracing and Evaluating Agents

Agent Cookbook
Build a customer support agent, then trace its activity, assess its performance, and experiment with prompts and models.

Evaluate an Agent
Trace and evaluate a "talk-to-your-data" agent. Includes evaluations for function calling accuracy, SQL query generation, code generation, and agent execution path.

OpenAI Agents SDK Cookbook
Create an agent with the OpenAI Agents SDK, trace its activity, benchmark with datasets, run experiments, and evaluate traces in production.

Using Ragas to Evaluate a Math Problem-Solving Agent
Create an agent using the OpenAI Agents SDK, trace its interactions, and evaluate performance using Ragas.