# Vertex AI

> Use Google Vertex AI as the judge LLM for evaluations with the arize-phoenix-evals library.

The [`arize-phoenix-evals`](https://pypi.org/project/arize-phoenix-evals/) library uses an LLM as a judge to grade model output on criteria such as hallucination, factuality, helpfulness, toxicity, or a custom rubric. To use Vertex AI as the judge, pass `provider="vertex"` to the `LLM(...)` wrapper, build an evaluator with `create_classifier(...)`, and run it over a DataFrame with `evaluate_dataframe(...)`.

## Prerequisites

* Python 3.11+
* A Google Cloud project with the [Vertex AI API enabled](https://console.cloud.google.com/apis/library/aiplatform.googleapis.com)
* A service account or user with the `roles/aiplatform.user` IAM role
* Authenticated Application Default Credentials (`gcloud auth application-default login`) or a service account JSON file referenced by `GOOGLE_APPLICATION_CREDENTIALS`

## Install

```bash theme={null}
pip install arize-phoenix-evals litellm google-auth pandas
```

The `vertex` provider dispatches via the LiteLLM backend to the regional `aiplatform.googleapis.com` endpoint. `google-auth` is required so LiteLLM can resolve Application Default Credentials; without it, the first eval call fails with `ModuleNotFoundError: No module named 'google'`.

## Configure credentials

Vertex AI uses Google Cloud auth, not an API key. Authenticate locally and tell the SDK which project/region to target:

```bash theme={null}
# Recommended for local dev — uses your gcloud user credentials.
gcloud auth application-default login

# Or, with a service account JSON file:
# export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"

export VERTEXAI_PROJECT="<your-gcp-project-id>"
export VERTEXAI_LOCATION="us-central1"  # optional; LiteLLM defaults to us-central1
```

`VERTEXAI_PROJECT` is effectively required: if it isn't set and Application Default Credentials don't surface a default project, the call fails with `Could not resolve project_id`. `VERTEXAI_LOCATION` is optional and defaults to `us-central1`; set it explicitly when you need a different region (e.g. `europe-west1` for EU residency, or to match where the target model is enabled).
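A preflight check can surface a missing project ID or a bad key path before the first eval call. The sketch below uses only the standard library. `check_vertex_env` is a hypothetical helper, not part of `arize-phoenix-evals` or LiteLLM, and it inspects only the environment variables named above; it cannot tell whether `gcloud auth application-default login` has been run.

```python
import os
from collections.abc import Mapping


def check_vertex_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return human-readable problems found in the Vertex AI environment.

    Hypothetical helper for illustration; not a library function.
    """
    problems = []

    # This guide sets the project ID explicitly through VERTEXAI_PROJECT.
    if not env.get("VERTEXAI_PROJECT"):
        problems.append("VERTEXAI_PROJECT is not set")

    # Only relevant when a service account key file is used instead of
    # `gcloud auth application-default login`.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path and not os.path.isfile(key_path):
        problems.append(
            f"GOOGLE_APPLICATION_CREDENTIALS points to a missing file: {key_path}"
        )

    return problems


if __name__ == "__main__":
    for problem in check_vertex_env():
        print(f"warning: {problem}")
    # Mirrors the documented default region when VERTEXAI_LOCATION is unset.
    print("region:", os.environ.get("VERTEXAI_LOCATION", "us-central1"))
```

Running this at the top of an eval script turns a mid-run failure into an early warning. An empty result does not guarantee that authentication will succeed, only that the variables look plausible.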

## Set up the eval LLM

```python theme={null}
# eval_setup.py
from phoenix.evals import LLM

# `provider="vertex"` dispatches via the LiteLLM backend to the
# Vertex AI endpoint, picking up VERTEXAI_PROJECT, VERTEXAI_LOCATION,
# and Application Default Credentials from the environment.
llm = LLM(provider="vertex", model="gemini-2.5-flash")
```

`gemini-2.5-flash` is a strong default judge — fast and cheap relative to `gemini-2.5-pro`. The judge's job is classification, not generation, so a smaller model is often sufficient.

## Run an evaluation

This example builds a hallucination classifier and grades two sample question/answer pairs against a reference. The pattern generalizes: replace the prompt template, choices, and DataFrame columns with whatever metric you want to evaluate.

```python theme={null}
# example.py
import pandas as pd

from phoenix.evals import LLM, create_classifier, evaluate_dataframe

llm = LLM(provider="vertex", model="gemini-2.5-flash")

HALLUCINATION_PROMPT = """\
Determine whether the answer below is factually supported by the
reference. Reply with exactly one of: factual, hallucinated.

Question: {input}
Answer: {output}
Reference: {reference}
"""

evaluator = create_classifier(
    name="hallucination",
    prompt_template=HALLUCINATION_PROMPT,
    llm=llm,
    # `choices` maps each label the LLM may emit to a numeric score.
    # `direction="maximize"` (the default) means higher score is better.
    choices={"factual": 1.0, "hallucinated": 0.0},
)

df = pd.DataFrame([
    {
        "input":     "What is the capital of France?",
        "output":    "Paris is the capital of France.",
        "reference": "Paris is the capital and most populous city of France.",
    },
    {
        "input":     "What is the capital of France?",
        "output":    "Berlin is the capital of France.",
        "reference": "Paris is the capital and most populous city of France.",
    },
])

results = evaluate_dataframe(dataframe=df, evaluators=[evaluator])

# Each `hallucination_score` cell is a dict-like Score (`score`, `label`,
# `explanation`, …); extract the numeric score into a flat display column.
results["score"] = results["hallucination_score"].apply(lambda r: r["score"])
print(results[["input", "output", "score"]].to_string())
```

### Expected output

```text wrap theme={null}
                            input                            output  score
0  What is the capital of France?   Paris is the capital of France.    1.0
1  What is the capital of France?  Berlin is the capital of France.    0.0
```

The returned DataFrame also includes `hallucination_execution_details` (status, exceptions, and timing) and the original `hallucination_score` column, which holds each evaluator result as a full dict (`name`, `score`, `label`, `explanation`, `metadata`, `kind`, `direction`). These columns are useful for surfacing the LLM's reasoning, persisting eval rows back to Arize AX, or picking out failed rows to retry.
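To surface the judge's reasoning alongside the score, the result dict can be flattened into a few plain fields. The sketch below is standard-library only. `flatten_score` is a hypothetical helper, and it assumes each cell is either a dict-like object or a JSON string of one; the sample dict is hand-written, not real model output.

```python
import json
from collections.abc import Mapping
from typing import Any

FIELDS = ("score", "label", "explanation")


def flatten_score(cell: Any) -> dict[str, Any]:
    """Pick `score`, `label`, and `explanation` out of one score cell.

    Hypothetical helper for illustration; not a library function.
    """
    # Assumption: a cell is a dict-like object, or a JSON string of one.
    if isinstance(cell, str):
        cell = json.loads(cell)
    if not isinstance(cell, Mapping):
        # Covers None or NaN, e.g. when no result exists for that row.
        return {key: None for key in FIELDS}
    return {key: cell.get(key) for key in FIELDS}


# Hand-written sample shaped like the dict described above.
sample = {
    "name": "hallucination",
    "score": 0.0,
    "label": "hallucinated",
    "explanation": "The reference names Paris, not Berlin.",
}
print(flatten_score(sample))
```

With pandas, `pd.DataFrame(results["hallucination_score"].map(flatten_score).tolist(), index=results.index)` would expand the three fields into columns aligned with the original rows.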

## Troubleshooting

* **`ModuleNotFoundError: No module named 'google'`.** The `google-auth` package isn't installed. Add it to your install line (`pip install ... google-auth ...`), or install `litellm[google]` instead, which pulls in the full `google-cloud-aiplatform` SDK plus its auth dependencies.
* **`Permission denied on resource project ...` / `PERMISSION_DENIED`.** Either the principal in your ADC lacks `roles/aiplatform.user` (or finer-grained Vertex permissions) on the project, or you authenticated with end-user credentials that have no quota project. For the first case, grant the role in the [IAM console](https://console.cloud.google.com/iam-admin/iam). For the second, run `gcloud auth application-default set-quota-project <PROJECT_ID>`.
* **`Reauthentication needed` / expired credentials.** Run `gcloud auth application-default login` again, or rotate the service account key referenced by `GOOGLE_APPLICATION_CREDENTIALS`.
* **`Could not resolve project_id`.** `VERTEXAI_PROJECT` isn't set and ADC didn't surface a default project. Either export `VERTEXAI_PROJECT` explicitly or run `gcloud config set project <PROJECT_ID>` before `gcloud auth application-default login`.
* **`404 NOT_FOUND` for the model.** The model isn't available in the region you set for `VERTEXAI_LOCATION` (or in the default `us-central1` if you didn't set one). Check the [Vertex AI generative model availability matrix](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models) and swap regions accordingly.
* **All rows return the same label.** Your prompt template isn't differentiating cases. Make sure each row's `{input}`/`{output}`/`{reference}` columns expose enough context for the judge to discriminate, and that `choices` lists every label your prompt asks the LLM to emit.
* **Some rows fail with timeouts or rate-limit errors.** Pass `max_retries=` to `evaluate_dataframe(...)` (defaults to 3). For large batches, also pass `initial_per_second_request_rate=...` to `LLM(...)` to throttle.
* **Logging results back to Arize AX.** This guide stops at producing the eval DataFrame. To attach those evals to existing spans in an Arize AX project, use [`log_evaluations_sync`](/ax/cookbooks/evaluation/evaluations-quickstart#log-evaluations-back-to-arize) on `arize.Client`.
* **Using the Gemini API instead of Vertex.** Set `GEMINI_API_KEY` and switch to `provider="google"` — see the [Gemini evals](/ax/integrations/llm-providers/google-gen-ai/gemini-evals) doc for the full pattern.
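For the "all rows return the same label" case above, a quick static check of the template can rule out two common causes before any model call is made: a placeholder with no matching DataFrame column, and a label in `choices` that the prompt never mentions. The sketch below is standard-library only. `template_fields` and `check_template` are hypothetical helpers, and they assume single-brace `str.format`-style placeholders like those in the example prompt.

```python
from string import Formatter


def template_fields(template: str) -> set[str]:
    """Return the placeholder names used in a str.format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def check_template(
    template: str, columns: list[str], choices: dict[str, float]
) -> list[str]:
    """Return mismatches between a prompt template, columns, and choices.

    Hypothetical helper for illustration; not a library function.
    """
    problems = []

    # Every placeholder needs a DataFrame column of the same name.
    missing = template_fields(template) - set(columns)
    if missing:
        problems.append(f"columns missing for placeholders: {sorted(missing)}")

    # Every label in `choices` should appear somewhere in the prompt text.
    unmentioned = [label for label in choices if label not in template]
    if unmentioned:
        problems.append(f"labels not mentioned in the prompt: {unmentioned}")

    return problems


PROMPT = (
    "Reply with exactly one of: factual, hallucinated.\n"
    "Question: {input}\nAnswer: {output}\nReference: {reference}\n"
)
print(check_template(PROMPT, ["input", "output"], {"factual": 1.0, "hallucinated": 0.0}))
```

The label check is a plain substring test, so it can miss cases where a label appears only as part of another word. Treat a clean result as a sanity check, not as proof that the prompt discriminates well.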

## Resources

<CardGroup>
  <Card icon="book-open" href="https://arize.com/docs/phoenix/evaluation/llm-evals" title="Phoenix Evals Documentation" horizontal />

  <Card icon="terminal" href="https://pypi.org/project/arize-phoenix-evals/" title="arize-phoenix-evals on PyPI" horizontal />

  <Card icon="github" href="https://github.com/Arize-ai/phoenix/tree/main/packages/phoenix-evals" title="Phoenix Evals Source" horizontal />

  <Card icon="book-open" href="/ax/integrations/llm-providers/vertexai/vertexai-tracing" title="Vertex AI Tracing (instrument app calls)" horizontal />
</CardGroup>
