Input Mapping

Evaluators are defined with a specific input schema and expect the input payload to match it. Input data is not always structured that way, so an evaluator can be given an optional input_mapping that maps or transforms the input into the shape it requires. Input mappings can extract and transform data from complex nested structures.

Summary

Use input_mapping to map evaluator-required field names onto your input data, transforming values where needed.
To reuse an input_mapping across multiple inputs, bind it to an evaluator with .bind or bind_evaluator.

Why do evaluators accept a payload and an input_mapping vs. kwargs?

Different evaluators require different keyword arguments to operate. These arguments may not perfectly match those in your example or dataset.

Let's say our example looks like this, where the inputs and outputs contain nested values:

eval_input = {
    "input": {
        "query": "user input query",
        "documents": ["doc A", "doc B"],
    },
    "output": {"response": "model answer"},
    "expected": "correct answer",
}

type EvalInput = {
  input: {
    query: string;
    documents: string[];
  };
  output: { response: string };
  expected: string;
};

const evalInput: EvalInput = {
  input: {
    query: "user input query",
    documents: ["doc A", "doc B"],
  },
  output: { response: "model answer" },
  expected: "correct answer",
};

We want to run two evaluators over this example:

Hallucination, which requires query, context, and response
exact_match, which requires expected and output

Rather than modifying our data to fit the two evaluators, we make the evaluators fit the data.

Binding an input_mapping enables the evaluators to run on the same payload - the map/transform steps are handled by the evaluator itself.

# define an input_mapping to map the inputs required by the hallucination evaluator to our data
input_mapping = {
    "input": "input.query",  # dot notation to access nested keys
    "output": "output.response",  # dot notation to access nested keys
    "context": lambda x: " ".join(
        x["input"]["documents"]
    ),  # lambda function to combine the document chunks
}
# the evaluator uses the input_mapping to transform the eval_input into the expected input schema
result = hallucination_evaluator.evaluate(eval_input, input_mapping)

import { bindEvaluator, createHallucinationEvaluator } from "@arizeai/phoenix-evals";
import { openai } from "@ai-sdk/openai";

type EvalInput = {
  input: {
    query: string;
    documents: string[];
  };
  output: { response: string };
  expected: string;
};

// Define an input mapping to map inputs required by hallucination evaluator to our data
const evaluator = bindEvaluator<EvalInput>(
  createHallucinationEvaluator({ model: openai("gpt-4") }),
  {
    inputMapping: {
      input: "input.query",  // dot notation to access nested keys
      output: "output.response",  // dot notation to access nested keys
      reference: (data) => data.input.documents.join(" "),  // function to combine document chunks
    },
  }
);

// The evaluator uses the inputMapping to transform the evalInput into the expected input schema
const result = await evaluator.evaluate({
  input: {
    query: "user input query",
    documents: ["doc A", "doc B"],
  },
  output: { response: "model answer" },
  expected: "correct answer",
});

Input Mapping Types

The input_mapping parameter accepts several types of mappings:

Simple key mapping: {"field": "key"} - maps evaluator field to input key
Path mapping: {"field": "nested.path"} - uses JSON path syntax from jsonpath
Callable mapping: {"field": lambda x: x["key"]} - custom extraction logic

Path Mapping Examples

# Nested dictionary access
input_mapping = {
    "query": "input.query",
    "context": "input.documents",
    "response": "output.answer"
}

# Array indexing
input_mapping = {
    "first_doc": "input.documents[0]",
    "last_doc": "input.documents[-1]"
}

# Combined nesting and list indexing
input_mapping = {
    "user_query": "data.user.messages[0].content",
}

// Nested dictionary access
const nestedMapping = {
  query: "input.query",
  context: "input.documents",
  response: "output.answer",
};

// Array indexing
const arrayMapping = {
  firstDoc: "input.documents[0]",
  lastDoc: "input.documents[-1]",
};

// Combined nesting and list indexing
const combinedMapping = {
  userQuery: "data.user.messages[0].content",
};
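
To make the path syntax above concrete, here is a minimal, hand-rolled sketch of how a dotted path with bracket indices could be resolved against a payload. This is not how Phoenix implements it (the library resolves paths internally); resolve_path is a hypothetical helper that assumes only dot-separated keys and integer list indices.

```python
import re
from typing import Any

# Hypothetical helper for illustration only; not part of phoenix.evals.
# Matches either a plain key ("documents") or a bracketed integer index ("[0]").
_TOKEN = re.compile(r"([^.\[\]]+)|\[(-?\d+)\]")


def resolve_path(payload: Any, path: str) -> Any:
    """Walk `payload` following `path`, e.g. 'input.documents[-1]'."""
    current = payload
    for key, index in _TOKEN.findall(path):
        if key:
            current = current[key]  # dictionary access
        else:
            current = current[int(index)]  # list access (negative indices allowed)
    return current


payload = {
    "input": {"query": "user input query", "documents": ["doc A", "doc B"]},
    "data": {"user": {"messages": [{"content": "hello"}]}},
}

first_doc = resolve_path(payload, "input.documents[0]")
last_doc = resolve_path(payload, "input.documents[-1]")
user_query = resolve_path(payload, "data.user.messages[0].content")
```

A missing key or out-of-range index raises KeyError or IndexError in this sketch; a callable mapping is the better fit when you need a fallback value.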

Callable Mappings

For complex transformations, use callable functions that accept an eval_input payload:

# Callable example
def extract_context(eval_input):
    docs = eval_input.get("input", {}).get("documents", [])
    return " ".join(docs[:3])  # Join first 3 documents

input_mapping = {
    "query": "input.query",
    "context": extract_context,
    "response": "output.answer"
}

# Lambda example
input_mapping = {
    "user_query": lambda x: x["input"]["query"].lower(),
    "context": lambda x: " ".join(x["input"]["documents"][:3])
}

type EvalInput = {
  input: {
    query: string;
    documents: string[];
  };
  output: { answer: string };
};

// Function-based mapping example
function extractContext(evalInput: EvalInput): string {
  const docs = evalInput.input?.documents || [];
  return docs.slice(0, 3).join(" ");  // Join first 3 documents
}

const inputMapping = {
  query: "input.query",
  context: extractContext,
  response: "output.answer",
};

// Arrow function example
const arrowMapping = {
  userQuery: (data: EvalInput) => data.input.query.toLowerCase(),
  context: (data: EvalInput) => data.input.documents.slice(0, 3).join(" "),
};

Pydantic Input Schemas

Python evaluators use Pydantic models for input validation and type safety. In most cases (e.g. ClassificationEvaluator or functions decorated with create_evaluator), the input schema is inferred, but you can always define your own. A Pydantic model also lets you annotate input fields with additional information such as aliases or descriptions.

from pydantic import BaseModel
from typing import List

class HallucinationInput(BaseModel):
    query: str
    context: List[str]
    response: str

evaluator = HallucinationEvaluator(
    name="hallucination",
    llm=llm,
    prompt_template="...",
    input_schema=HallucinationInput
)

Schema Inference

Most evaluators automatically infer schemas if not provided at instantiation.

LLM evaluators infer schemas from prompt templates:

# This creates a schema with required str fields: query, context, response
evaluator = LLMEvaluator(
    name="hallucination",
    llm=llm,
    prompt_template="Query: {query}\nContext: {context}\nResponse: {response}"
)

import { createClassificationEvaluator } from "@arizeai/phoenix-evals";
import { openai } from "@ai-sdk/openai";

// Define the structure of input records your evaluator will receive
type MyEvalInput = {
  input: {
    query: string;
    documents: string[];
  };
  output: { response: string };
  expected: string;
};

// Construct a custom evaluator using the factory method
const myCustomEvaluator = createClassificationEvaluator<MyEvalInput>({
  name: "my_custom_eval",
  model: openai("gpt-4"),
  promptTemplate: `
    Determine if the following response correctly answers the user query using the provided context.
    Query: {input.query}
    Context: {input.documents}
    Response: {output.response}
  `,
  choices: ["Correct", "Incorrect"],
  // ...other options, scoring logic, etc. can be set here
});

// View the actual required fields to make sure they align with the type
console.log(myCustomEvaluator.promptTemplateVariables);
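
One way to picture template-based inference is to collect the placeholder names from the prompt template and treat each one as a required string field. The sketch below does this with Python's standard string.Formatter; infer_fields is a hypothetical name, and the library's actual inference logic may differ.

```python
from string import Formatter
from typing import List


def infer_fields(prompt_template: str) -> List[str]:
    """Return placeholder names in first-seen order, without duplicates."""
    fields: List[str] = []
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion).
    for _literal, field_name, _spec, _conversion in Formatter().parse(prompt_template):
        if field_name and field_name not in fields:
            fields.append(field_name)
    return fields


template = "Query: {query}\nContext: {context}\nResponse: {response}"
required_fields = infer_fields(template)
```

Checking the inferred names against your data before running an evaluation is a cheap way to catch a misspelled placeholder.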

Decorated function evaluators infer schemas from the function signature:

@create_evaluator(name="exact_match")
def exact_match(output: str, expected: str) -> Score:
  ...
# creates input_schema with required str fields: output, expected
{'properties': {
  'output': {'title': 'Output','type': 'string'},
  'expected': {'title': 'Expected', 'type': 'string'}
  },
  'required': ['output', 'expected']
}
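
As a rough illustration of signature-based inference, the standard inspect module can read parameter names, annotations, and defaults from a function. The sketch below builds a dictionary shaped like the schema above; infer_schema is a hypothetical helper, not the create_evaluator implementation, and its title formatting is a simplification.

```python
import inspect
from typing import Any, Callable, Dict, List


def infer_schema(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build a JSON-schema-like dict from a function's parameters."""
    type_names = {str: "string", int: "integer", float: "number", bool: "boolean"}
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {
            "title": name.capitalize(),
            "type": type_names.get(param.annotation, "object"),
        }
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the field is required
    return {"properties": properties, "required": required}


def exact_match(output: str, expected: str) -> bool:
    return output == expected


schema = infer_schema(exact_match)
```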

Binding System

Use bind_evaluator or .bind to create a pre-configured evaluator with a fixed input mapping. At evaluation time, you only need to provide the eval_input and the mapping is handled internally.

from phoenix.evals import bind_evaluator

# Create a bound evaluator with fixed mapping
bound_evaluator = bind_evaluator(
    evaluator,
    {
        "query": "input.query",
        "context": "input.documents",
        "response": "output.answer"
    }
)

# Run evaluation
scores = bound_evaluator.evaluate({
    "input": {"query": "How do I reset?", "documents": ["Manual", "Guide"]},
    "output": {"answer": "  Go to settings > reset.  "}
})

import { bindEvaluator, createHallucinationEvaluator } from "@arizeai/phoenix-evals";
import { openai } from "@ai-sdk/openai";

type EvalInput = {
  input: {
    query: string;
    documents: string[];
  };
  output: { answer: string };
};

// Create a bound evaluator with fixed mapping
const boundEvaluator = bindEvaluator<EvalInput>(
  createHallucinationEvaluator({ model: openai("gpt-4") }),
  {
    inputMapping: {
      input: "input.query",
      reference: "input.documents",
      output: "output.answer",
    },
  }
);

// Run evaluation
const scores = await boundEvaluator.evaluate({
  input: { query: "How do I reset?", documents: ["Manual", "Guide"] },
  output: { answer: "  Go to settings > reset.  " },
});
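
Conceptually, binding is a closure over the evaluator and its mapping: the bound object remaps each payload and then calls the underlying evaluator. The following sketch shows that idea in plain Python with an exact-match check. Both bind and get_path are hypothetical helpers written for this example (they are not phoenix.evals.bind_evaluator), and get_path handles dot-separated keys only.

```python
from typing import Any, Callable, Dict, Union

InputMapping = Dict[str, Union[str, Callable[[Dict[str, Any]], Any]]]


def get_path(payload: Dict[str, Any], path: str) -> Any:
    """Resolve a dot-separated path (no list indices in this sketch)."""
    current: Any = payload
    for key in path.split("."):
        current = current[key]
    return current


def bind(
    evaluator: Callable[..., Any], input_mapping: InputMapping
) -> Callable[[Dict[str, Any]], Any]:
    """Return a callable that remaps a payload, then calls `evaluator`."""

    def bound(eval_input: Dict[str, Any]) -> Any:
        kwargs = {
            field: source(eval_input) if callable(source) else get_path(eval_input, source)
            for field, source in input_mapping.items()
        }
        return evaluator(**kwargs)

    return bound


def exact_match(output: str, expected: str) -> bool:
    return output == expected


# The callable strips surrounding whitespace; the string is a plain key lookup.
bound_exact_match = bind(
    exact_match,
    {"output": lambda x: x["output"]["answer"].strip(), "expected": "expected"},
)

result = bound_exact_match(
    {"output": {"answer": "  Go to settings > reset.  "}, "expected": "Go to settings > reset."}
)
```

Because the mapping is fixed at bind time, the same bound callable can be applied to every record in a dataset without repeating the mapping at each call site.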
