```python
class GoogleGenAIModel:
    model: str = "gemini-2.5-flash"
    """The model name to use."""

    vertexai: Optional[bool] = None
    """Whether to use VertexAI instead of the Developer API."""

    api_key: Optional[str] = None
    """Your Google API key. If not provided, will be read from environment variables."""

    credentials: Optional["Credentials"] = None
    """Google Cloud credentials for VertexAI access."""

    project: Optional[str] = None
    """Google Cloud project ID for VertexAI."""

    location: Optional[str] = None
    """Google Cloud location for VertexAI."""

    initial_rate_limit: int = 5
    """Initial rate limit for API calls per second."""
```
The `GoogleGenAIModel` provides access to Google's Gemini models through the Google GenAI SDK. This is Google's recommended approach for accessing Gemini models as of late 2024, providing a unified interface for both the Developer API and VertexAI.
## Key Features

- **Multimodal Support**: Supports text, image, and audio inputs
- **Async Support**: Fully async-compatible for high-throughput evaluations
- **Flexible Authentication**: Works with both API keys and VertexAI credentials
- **Rate Limiting**: Built-in dynamic rate limiting with automatic adjustment (see the sketch after this list)
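The rate limiter starts at the `initial_rate_limit` shown in the class definition and adjusts automatically from there. If you know your quota allows a faster start, a minimal sketch of tuning it (the value `10` is just an illustration):

```python
from phoenix.evals import GoogleGenAIModel

# Start at 10 requests per second instead of the default 5; the limiter
# still adjusts dynamically from this starting point.
model = GoogleGenAIModel(initial_rate_limit=10)
```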
## Authentication Options

### Option 1: Using API Key (Developer API)

Set the `GOOGLE_API_KEY` or `GEMINI_API_KEY` environment variable:

```bash
export GOOGLE_API_KEY=your_api_key_here
```
```python
from phoenix.evals import GoogleGenAIModel

# API key will be read from environment
model = GoogleGenAIModel()
```
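The `api_key` field from the class definition can also be passed directly, which is useful when keys are managed outside of environment variables (the literal string below is a placeholder):

```python
from phoenix.evals import GoogleGenAIModel

# Pass the key explicitly instead of relying on environment variables
# ("your_api_key_here" is a placeholder, not a real key).
model = GoogleGenAIModel(api_key="your_api_key_here")
```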
### Option 2: Using VertexAI

```python
model = GoogleGenAIModel(
    vertexai=True,
    project="your-project-id",
    location="us-central1"
)
```
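VertexAI access typically relies on Application Default Credentials, but the `credentials` field from the class definition also accepts an explicit google-auth `Credentials` object. A minimal sketch, assuming a local service-account key file (the filename is a placeholder):

```python
from google.oauth2 import service_account
from phoenix.evals import GoogleGenAIModel

# Load explicit credentials from a service-account key file
# ("service-account.json" is a placeholder path).
credentials = service_account.Credentials.from_service_account_file(
    "service-account.json"
)

model = GoogleGenAIModel(
    vertexai=True,
    project="your-project-id",
    location="us-central1",
    credentials=credentials,
)
```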
## Basic Usage

```python
from phoenix.evals import GoogleGenAIModel

# Initialize with default settings
model = GoogleGenAIModel(model="gemini-2.5-flash")

# Simple text generation
response = model("What is the capital of France?")
print(response)  # "The capital of France is Paris."
```
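For high-throughput evaluations, the async path mirrors the synchronous calls above. A minimal sketch, assuming the model exposes an `_async_generate` counterpart to the `_generate` method used in the multimodal examples below (verify the method name against your installed version):

```python
import asyncio

from phoenix.evals import GoogleGenAIModel
from phoenix.evals.templates import MultimodalPrompt, PromptPart, PromptPartContentType

async def main() -> None:
    model = GoogleGenAIModel()
    prompt = MultimodalPrompt(
        parts=[
            PromptPart(
                content_type=PromptPartContentType.TEXT,
                content="What is the capital of France?",
            )
        ]
    )
    # Assumption: _async_generate mirrors the synchronous _generate API.
    response = await model._async_generate(prompt=prompt)
    print(response)

asyncio.run(main())
```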
## Multimodal Usage

**Image Input:**

```python
import base64

from phoenix.evals.templates import MultimodalPrompt, PromptPart, PromptPartContentType

# Load and encode an image
with open("image.jpg", "rb") as f:
    image_bytes = f.read()
image_base64 = base64.b64encode(image_bytes).decode("utf-8")

# Create multimodal prompt
prompt = MultimodalPrompt(
    parts=[
        PromptPart(content_type=PromptPartContentType.TEXT, content="What's in this image?"),
        PromptPart(content_type=PromptPartContentType.IMAGE, content=image_base64),
    ]
)

response = model._generate(prompt=prompt)
print(response)
```
**Audio Input:**

```python
# Load and encode audio
with open("audio.wav", "rb") as f:
    audio_bytes = f.read()
audio_base64 = base64.b64encode(audio_bytes).decode("utf-8")

prompt = MultimodalPrompt(
    parts=[
        PromptPart(content_type=PromptPartContentType.AUDIO, content=audio_base64)
    ]
)

response = model._generate(prompt=prompt)
print(response)
```
## Supported Models

The `GoogleGenAIModel` supports all Gemini models available through the Google GenAI SDK, including:

- `gemini-2.5-flash` (default)
- `gemini-2.5-flash-001`
- `gemini-2.0-flash-001`
- `gemini-1.5-pro`
- `gemini-1.5-flash`
## Supported File Formats

- **Images**: PNG, JPEG, WebP, HEIC, HEIF
- **Audio**: WAV, MP3, AIFF, AAC, OGG, FLAC
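Since `PromptPart` expects base64-encoded content, a small hypothetical helper (not part of `phoenix.evals`) can route a local file to the right content type based on the format lists above:

```python
import base64
from pathlib import Path

from phoenix.evals.templates import PromptPart, PromptPartContentType

# Extensions derived from the supported-format lists above.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".heic", ".heif"}
AUDIO_EXTENSIONS = {".wav", ".mp3", ".aiff", ".aac", ".ogg", ".flac"}

def file_to_prompt_part(path: str) -> PromptPart:
    """Encode a local image or audio file as a base64 PromptPart."""
    suffix = Path(path).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        content_type = PromptPartContentType.IMAGE
    elif suffix in AUDIO_EXTENSIONS:
        content_type = PromptPartContentType.AUDIO
    else:
        raise ValueError(f"Unsupported file format: {suffix}")
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return PromptPart(content_type=content_type, content=encoded)

# Usage: MultimodalPrompt(parts=[file_to_prompt_part("image.jpg")])
```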