```python
class GoogleGenAIModel:
    model: str = "gemini-2.5-flash"
    """The model name to use."""

    vertexai: Optional[bool] = None
    """Whether to use VertexAI instead of the Developer API."""

    api_key: Optional[str] = None
    """Your Google API key. If not provided, will be read from environment variables."""

    credentials: Optional["Credentials"] = None
    """Google Cloud credentials for VertexAI access."""

    project: Optional[str] = None
    """Google Cloud project ID for VertexAI."""

    location: Optional[str] = None
    """Google Cloud location for VertexAI."""

    initial_rate_limit: int = 5
    """Initial rate limit for API calls per second."""
```
The `GoogleGenAIModel` provides access to Google's Gemini models through the Google GenAI SDK. This is Google's recommended approach for accessing Gemini models as of late 2024, providing a unified interface for both the Developer API and VertexAI.
## Key Features

- **Multimodal Support**: Supports text, image, and audio inputs
- **Async Support**: Fully async-compatible for high-throughput evaluations
- **Flexible Authentication**: Works with both API keys and VertexAI credentials
- **Rate Limiting**: Built-in dynamic rate limiting with automatic adjustment (see the sketch after this list)
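The rate limiter starts at the `initial_rate_limit` shown in the class definition and adjusts automatically from there. If you know your quota allows a faster start, a minimal sketch of tuning it (the value `10` is just an illustration):

```python
from phoenix.evals import GoogleGenAIModel

# Start at 10 requests per second instead of the default 5; the limiter
# still adjusts dynamically from this starting point.
model = GoogleGenAIModel(initial_rate_limit=10)
```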
## Authentication Options

### Option 1: Using API Key (Developer API)

Set the `GOOGLE_API_KEY` or `GEMINI_API_KEY` environment variable:

```bash
export GOOGLE_API_KEY=your_api_key_here
```
```python
from phoenix.evals import GoogleGenAIModel

# API key will be read from environment
model = GoogleGenAIModel()
```
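The `api_key` field from the class definition can also be passed directly, which is useful when keys are managed outside of environment variables (the literal string below is a placeholder):

```python
from phoenix.evals import GoogleGenAIModel

# Pass the key explicitly instead of relying on environment variables
# ("your_api_key_here" is a placeholder, not a real key).
model = GoogleGenAIModel(api_key="your_api_key_here")
```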
### Option 2: Using VertexAI

```python
model = GoogleGenAIModel(
    vertexai=True,
    project="your-project-id",
    location="us-central1"
)
```
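VertexAI access typically relies on Application Default Credentials, but the `credentials` field from the class definition also accepts an explicit google-auth `Credentials` object. A minimal sketch, assuming a local service-account key file (the filename is a placeholder):

```python
from google.oauth2 import service_account
from phoenix.evals import GoogleGenAIModel

# Load explicit credentials from a service-account key file
# ("service-account.json" is a placeholder path).
credentials = service_account.Credentials.from_service_account_file(
    "service-account.json"
)

model = GoogleGenAIModel(
    vertexai=True,
    project="your-project-id",
    location="us-central1",
    credentials=credentials,
)
```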
## Basic Usage

```python
from phoenix.evals import GoogleGenAIModel

# Initialize with default settings
model = GoogleGenAIModel(model="gemini-2.5-flash")

# Simple text generation
response = model("What is the capital of France?")
print(response)  # "The capital of France is Paris."
```
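For high-throughput evaluations, the async path mirrors the synchronous calls above. A minimal sketch, assuming the model exposes an `_async_generate` counterpart to the `_generate` method used in the multimodal examples below (verify the method name against your installed version):

```python
import asyncio

from phoenix.evals import GoogleGenAIModel
from phoenix.evals.templates import MultimodalPrompt, PromptPart, PromptPartContentType

async def main() -> None:
    model = GoogleGenAIModel()
    prompt = MultimodalPrompt(
        parts=[
            PromptPart(
                content_type=PromptPartContentType.TEXT,
                content="What is the capital of France?",
            )
        ]
    )
    # Assumption: _async_generate mirrors the synchronous _generate API.
    response = await model._async_generate(prompt=prompt)
    print(response)

asyncio.run(main())
```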
## Multimodal Usage

**Image Input:**

```python
import base64

from phoenix.evals.templates import MultimodalPrompt, PromptPart, PromptPartContentType

# Load and encode an image
with open("image.jpg", "rb") as f:
    image_bytes = f.read()
image_base64 = base64.b64encode(image_bytes).decode("utf-8")

# Create multimodal prompt
prompt = MultimodalPrompt(
    parts=[
        PromptPart(content_type=PromptPartContentType.TEXT, content="What's in this image?"),
        PromptPart(content_type=PromptPartContentType.IMAGE, content=image_base64),
    ]
)

response = model._generate(prompt=prompt)
print(response)
```
**Audio Input:**

```python
# Load and encode audio
with open("audio.wav", "rb") as f:
    audio_bytes = f.read()
audio_base64 = base64.b64encode(audio_bytes).decode("utf-8")

prompt = MultimodalPrompt(
    parts=[
        PromptPart(content_type=PromptPartContentType.AUDIO, content=audio_base64)
    ]
)

response = model._generate(prompt=prompt)
print(response)
```
## Supported Models

The `GoogleGenAIModel` supports all Gemini models available through the Google GenAI SDK, including:

- `gemini-2.5-flash` (default)
- `gemini-2.5-flash-001`
- `gemini-2.0-flash-001`
- `gemini-1.5-pro`
- `gemini-1.5-flash`
## Supported File Formats

- **Images**: PNG, JPEG, WebP, HEIC, HEIF
- **Audio**: WAV, MP3, AIFF, AAC, OGG, FLAC
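Since `PromptPart` expects base64-encoded content, a small hypothetical helper (not part of `phoenix.evals`) can route a local file to the right content type based on the format lists above:

```python
import base64
from pathlib import Path

from phoenix.evals.templates import PromptPart, PromptPartContentType

# Extensions derived from the supported-format lists above.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".heic", ".heif"}
AUDIO_EXTENSIONS = {".wav", ".mp3", ".aiff", ".aac", ".ogg", ".flac"}

def file_to_prompt_part(path: str) -> PromptPart:
    """Encode a local image or audio file as a base64 PromptPart."""
    suffix = Path(path).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        content_type = PromptPartContentType.IMAGE
    elif suffix in AUDIO_EXTENSIONS:
        content_type = PromptPartContentType.AUDIO
    else:
        raise ValueError(f"Unsupported file format: {suffix}")
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return PromptPart(content_type=content_type, content=encoded)

# Usage: MultimodalPrompt(parts=[file_to_prompt_part("image.jpg")])
```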