@arizeai/phoenix-evals pairs with @arizeai/phoenix-client when you want to run evaluator-backed experiments and store both task traces and evaluation results in Phoenix.

Relevant Source Files

  • src/index.ts for the root evaluator exports
  • companion package: @arizeai/phoenix-client/datasets
  • companion package: @arizeai/phoenix-client/experiments

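Assuming an npm-based project, the three packages the example below imports (including the AI SDK OpenAI provider) install with:

```shell
# Install the evaluator package, the Phoenix client, and the OpenAI provider
npm install @arizeai/phoenix-evals @arizeai/phoenix-client @ai-sdk/openai
```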
Example

import { openai } from "@ai-sdk/openai";
import { createFaithfulnessEvaluator } from "@arizeai/phoenix-evals";
import { createOrGetDataset } from "@arizeai/phoenix-client/datasets";
import {
  asExperimentEvaluator,
  runExperiment,
} from "@arizeai/phoenix-client/experiments";

// Upsert a small dataset of question/context/answer examples
await createOrGetDataset({
  name: "support-eval",
  examples: [
    {
      input: {
        question: "Is Phoenix open source?",
        context: "Phoenix is open source.",
      },
      output: {
        answer: "Phoenix is open source.",
      },
    },
  ],
});

// Build an LLM-backed faithfulness evaluator from @arizeai/phoenix-evals
const faithfulness = createFaithfulnessEvaluator({
  model: openai("gpt-4o-mini"),
});

await runExperiment({
  dataset: { datasetName: "support-eval" },
  // The task receives the full example; read the question and context
  // from its input fields (matching the evaluator callback below).
  task: async ({ input }) =>
    `${input.question} Answer using only this context: ${input.context}`,
  evaluators: [
    // Adapt the evaluator to the experiment-evaluator interface
    asExperimentEvaluator({
      name: "faithfulness",
      kind: "LLM",
      evaluate: async ({ input, output }) =>
        faithfulness.evaluate({
          input: String(input.question ?? ""),
          context: String(input.context ?? ""),
          output: String(output ?? ""),
        }),
    }),
  ],
});

What Each Package Does

  • @arizeai/phoenix-evals builds evaluator logic
  • @arizeai/phoenix-client handles experiment execution and persistence
  • combined usage produces evaluator traces and experiment results in Phoenix
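The seam between the two packages is the evaluator contract: `asExperimentEvaluator` wraps an `evaluate` function that maps an example's input and the task's output to a result record. A minimal self-contained sketch of that contract, with a toy substring check standing in for the LLM judge (the exact result fields `score`, `label`, and `explanation` are an assumption here, not the package's verbatim types):

```typescript
// Hypothetical result shape; real evaluators in @arizeai/phoenix-evals
// return a richer record, but score/label/explanation is the core idea.
interface EvalResult {
  score: number;
  label: "faithful" | "unfaithful";
  explanation: string;
}

// Toy stand-in for an LLM judge: the output counts as faithful when
// every sentence of it appears verbatim in the context.
function toyFaithfulness(context: string, output: string): EvalResult {
  const faithful = output
    .split(".")
    .map((s) => s.trim())
    .filter(Boolean)
    .every((sentence) => context.includes(sentence));
  return {
    score: faithful ? 1 : 0,
    label: faithful ? "faithful" : "unfaithful",
    explanation: faithful
      ? "Every output sentence is grounded in the context."
      : "Some output sentence is not found in the context.",
  };
}

// Same call shape the experiment evaluator uses: input fields plus output
const result = toyFaithfulness(
  "Phoenix is open source.",
  "Phoenix is open source."
);
console.log(result.label); // "faithful"
```

Swapping the toy check for `faithfulness.evaluate(...)` gives the LLM-backed version shown in the example above; the surrounding plumbing stays the same.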

Source Map

  • src/index.ts
  • src/llm/
  • companion package: @arizeai/phoenix-client/datasets
  • companion package: @arizeai/phoenix-client/experiments