Instructor is a Python library that makes it easy to get structured data from LLMs. This guide shows how to instrument your Instructor application using OpenInference to send trace data to Arize for observability, allowing you to see both the Instructor-specific operations and the underlying LLM calls.
API Key Setup
Before running your application, ensure you have the following environment variables set:
export ARIZE_SPACE_ID="YOUR_ARIZE_SPACE_ID"
export ARIZE_API_KEY="YOUR_ARIZE_API_KEY"
export OPENAI_API_KEY="YOUR_OPENAI_API_KEY"  # Needed for the OpenAI example
You can find your Arize Space ID and API Key in your Arize account settings.
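A missing key usually surfaces later as an authentication error, so it can help to check the environment up front. The following is a minimal sketch, not part of Arize or Instructor: the helper name `missing_env_vars` is hypothetical, and the variable names are the ones from the export commands above.

```python
import os

# Variable names taken from the export commands in this guide.
REQUIRED_VARS = ("ARIZE_SPACE_ID", "ARIZE_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Report which variables still need to be exported (empty list if none).
missing = missing_env_vars()
print("Missing environment variables:", missing or "none")
```

Running this before the tracing setup gives a clear message about which variable to export, instead of a failure inside the exporter or the OpenAI client.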
Install
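Install Instructor, its OpenInference instrumentor, the instrumentor for the underlying LLM client (e.g., OpenAI), Arize OTel, and supporting OpenTelemetry packages:
pip install instructor openinference-instrumentation-instructor openinference-instrumentation-openai arize-otel opentelemetry-sdk opentelemetry-exporter-otlp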
Remember to install the OpenInference instrumentor for the specific LLM client library you are using with Instructor (e.g., openinference-instrumentation-openai for OpenAI, openinference-instrumentation-anthropic for Anthropic, etc.).
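To confirm that the instrumentor packages are actually available in the active environment, a quick check such as the one below may help. It is a sketch using only the standard library; the helper `is_importable` is hypothetical, and the module paths are the ones imported in the tracing setup in this guide.

```python
import importlib.util

# Import paths for the instrumentors used in this guide.
INSTRUMENTOR_MODULES = {
    "instructor": "openinference.instrumentation.instructor",
    "openai": "openinference.instrumentation.openai",
}


def is_importable(module_name):
    """Return True if the module can be located (parent packages get imported)."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or a spec is unavailable.
        return False


for client, module_name in INSTRUMENTOR_MODULES.items():
    status = "installed" if is_importable(module_name) else "MISSING"
    print(f"{client}: {module_name} -> {status}")
```

If you use a different LLM client, add its instrumentor module to the dictionary; the exact import path for other clients should be taken from that package's documentation.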
Setup Tracing
Connect to Arize using arize.otel.register and apply the InstructorInstrumentor as well as the instrumentor for your LLM client (e.g., OpenAIInstrumentor).
import os

from arize.otel import register
from openinference.instrumentation.instructor import InstructorInstrumentor
from openinference.instrumentation.openai import OpenAIInstrumentor  # Or your LLM client's instrumentor

# Ensure your API keys are set as environment variables
# ARIZE_SPACE_ID = os.getenv("ARIZE_SPACE_ID")
# ARIZE_API_KEY = os.getenv("ARIZE_API_KEY")
# OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")  # For the example

# Setup OTel via Arize's convenience function
tracer_provider = register(
    space_id=os.getenv("ARIZE_SPACE_ID"),
    api_key=os.getenv("ARIZE_API_KEY"),
    project_name="my-instructor-app",  # Choose a project name
)

# Instrument Instructor
InstructorInstrumentor().instrument(tracer_provider=tracer_provider)

# Instrument the underlying LLM client
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)  # Example for OpenAI

print("Instructor and OpenAI client instrumented for Arize.")
Run Instructor Example
Now you can use Instructor as you normally would. The instrumentors will capture traces.
import instructor
from pydantic import BaseModel
from openai import OpenAI  # Ensure OPENAI_API_KEY is set


# Define your desired output structure
class UserInfo(BaseModel):
    name: str
    age: int


# Patch the OpenAI client with Instructor
# The OpenAI client itself will be instrumented by OpenAIInstrumentor
# InstructorInstrumentor will trace the .create call patched by instructor.from_openai
client = instructor.from_openai(OpenAI())

# Extract structured data
user_info_response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=UserInfo,  # Instructor specific
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)
print(f"Name: {user_info_response.name}")
print(f"Age: {user_info_response.age}")

# Example with validation error
try:
    invalid_user_info = client.chat.completions.create(
        model="gpt-3.5-turbo",
        response_model=UserInfo,
        messages=[{"role": "user", "content": "The user is Jane."}],  # Age is missing
        max_retries=1,  # Optional: limit retries for demonstration
    )
except Exception as e:
    print(f"Failed to extract valid UserInfo: {e}")
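To make the retry and validation behavior in the last example easier to picture, here is a conceptual sketch of a validate-and-retry loop using only the standard library. It is not Instructor's implementation: `parse_user_info`, `extract_with_retries`, and `fake_model` are hypothetical names, the model is a stub that returns canned JSON, and `max_retries` here counts re-asks after the first attempt, which may differ from how Instructor counts attempts.

```python
import json
from dataclasses import dataclass


@dataclass
class UserInfo:
    name: str
    age: int


def parse_user_info(raw):
    """Parse a JSON string into UserInfo, raising ValueError on bad data."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("name"), str):
        raise ValueError("name must be a string")
    if not isinstance(data.get("age"), int):
        raise ValueError("age must be an integer")
    return UserInfo(name=data["name"], age=data["age"])


def extract_with_retries(call_model, prompt, max_retries=1):
    """Call the model, validate the reply, and re-ask with the error on failure."""
    last_error = None
    for _ in range(max_retries + 1):
        hint = "" if last_error is None else f"\nPrevious error: {last_error}"
        raw = call_model(prompt + hint)
        try:
            return parse_user_info(raw)
        except ValueError as exc:
            last_error = exc
    raise RuntimeError(
        f"validation failed after {max_retries + 1} attempts: {last_error}"
    )


def fake_model(prompt):
    # Hypothetical stand-in for an LLM: omits the age until it sees an error hint.
    if "Previous error" in prompt:
        return '{"name": "Jane", "age": 28}'
    return '{"name": "Jane"}'


print(extract_with_retries(fake_model, "The user is Jane."))
```

In a traced application, each pass through a loop like this corresponds to a separate LLM call, which is why retries and validation failures can show up as multiple spans under one Instructor operation.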
Observe in Arize
After running your Instructor application, traces will be sent to your Arize project. You can then log in to Arize to:
Visualize the calls made via Instructor.
Inspect the structured data extraction process, including any retries or validation errors if captured by Instructor's spans.
See the underlying LLM calls made by Instructor, along with their inputs and outputs.
Analyze latency and errors for both Instructor operations and LLM calls.