
Evals + Traces

Run evals on traces and log the results back to Phoenix as annotations.
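The flow can be sketched in plain Python: export spans, score each one, and collect annotation-shaped records keyed by span ID. Everything here is illustrative, not the Phoenix API; `judge` is a hypothetical stand-in for a real LLM-based evaluator, and the span and record shapes are simplified examples.

```python
def judge(output: str) -> tuple[str, float]:
    """Hypothetical stand-in for an LLM judge: returns (label, score)."""
    label = "helpful" if output.strip() else "unhelpful"
    return label, 1.0 if label == "helpful" else 0.0

# Spans exported from Phoenix (simplified): one record per trace span.
spans = [
    {"span_id": "a1", "output": "Paris is the capital of France."},
    {"span_id": "b2", "output": ""},
]

# Evaluate each span and build the annotation records that would be
# logged back to Phoenix, keyed by span_id.
annotations = [
    {"span_id": s["span_id"], "label": label, "score": score}
    for s in spans
    for label, score in [judge(s["output"])]
]
print(annotations)
```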

Evals + Experiments

How to run experiment evaluators.
All arize-phoenix-evals Evaluators are drop-in compatible with experiments.
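At its core, an experiment evaluator is a callable that scores a task's output against an example. A minimal pure-Python sketch of that contract; the hand-rolled loop, `task`, and `exact_match` are illustrative names, not the Phoenix API, which wires this up for you:

```python
def exact_match(output: str, expected: str) -> float:
    """Hypothetical evaluator: 1.0 if the task output matches exactly."""
    return 1.0 if output == expected else 0.0

def task(example: dict) -> str:
    """Hypothetical task under test."""
    return example["input"].upper()

# A tiny illustrative dataset of (input, expected) examples.
dataset = [
    {"input": "hi", "expected": "HI"},
    {"input": "bye", "expected": "BYE!"},
]

# Run the task over the dataset and score each output with the evaluator.
scores = [exact_match(task(ex), ex["expected"]) for ex in dataset]
print(scores)
```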

Evals + Prompt Management (Python)

If your evaluation prompt is versioned in Phoenix Prompt Management, you can fetch it with phoenix-client and convert it into an eval-ready PromptTemplate.
```python
from phoenix.client import Client
from phoenix.evals import (
    ClassificationEvaluator,
    LLM,
    phoenix_prompt_to_prompt_template,
)

client = Client(base_url="http://localhost:6006")
prompt_version = client.prompts.get(prompt_identifier="test-prompt")

prompt_template = phoenix_prompt_to_prompt_template(prompt_version)

evaluator = ClassificationEvaluator(
    name="response_quality",
    llm=LLM(provider="anthropic", model="claude-sonnet-4-6"),
    prompt_template=prompt_template,
    choices=["good", "bad"],
)
```
This keeps your eval logic aligned with prompt versions managed in Phoenix while still using the standard arize-phoenix-evals API.
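An eval-ready template exposes named variables that get bound per row at evaluation time. A stdlib sketch of pulling f-string-style variables out of a prompt's text; the template string is a made-up example, and Phoenix's own template syntax and parsing may differ:

```python
from string import Formatter

# Hypothetical prompt text, as it might come back from Prompt Management.
template = "Given the question:\n{input}\n\nIs this answer good or bad?\n{output}"

# Collect the {named} fields an eval harness would need to bind per row.
variables = [field for _, field, _, _ in Formatter().parse(template) if field]
print(variables)
```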