Instructor Tracing

Instructor is a Python library that makes it easy to get structured data from LLMs. This guide shows how to instrument an Instructor application with OpenInference and send trace data to Arize for observability, so you can see both Instructor-specific operations and the underlying LLM calls.

API Key Setup

Before running your application, ensure you have the following environment variables set:

export ARIZE_SPACE_ID="YOUR_ARIZE_SPACE_ID"
export ARIZE_API_KEY="YOUR_ARIZE_API_KEY"
export OPENAI_API_KEY="YOUR_OPENAI_API_KEY" # Needed for the OpenAI example

You can find your Arize Space ID and API Key in your Arize account settings.
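Optionally, you can sanity-check these variables in Python before starting, so a missing key fails loudly instead of traces silently not arriving. This is a small optional sketch, not something Arize or Instructor requires:

import os

# Optional pre-flight check: fail fast if any required variable is unset
required = ("ARIZE_SPACE_ID", "ARIZE_API_KEY", "OPENAI_API_KEY")
missing = [name for name in required if not os.getenv(name)]
if missing:
    raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")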

Install

Install Instructor, its OpenInference instrumentor, the instrumentor for the underlying LLM client (e.g., OpenAI), Arize OTel, and supporting OpenTelemetry packages:

pip install instructor openinference-instrumentation-instructor openinference-instrumentation-openai arize-otel opentelemetry-sdk opentelemetry-exporter-otlp

Remember to install the OpenInference instrumentor for the specific LLM client library you are using with Instructor, e.g. openinference-instrumentation-openai for OpenAI or openinference-instrumentation-anthropic for Anthropic.
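For example, if you use Instructor with Anthropic rather than OpenAI, the install line would look roughly like this, swapping the OpenAI instrumentor for the Anthropic one:

pip install instructor openinference-instrumentation-instructor openinference-instrumentation-anthropic arize-otel opentelemetry-sdk opentelemetry-exporter-otlp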

Setup Tracing

Connect to Arize using arize.otel.register and apply the InstructorInstrumentor as well as the instrumentor for your LLM client (e.g., OpenAIInstrumentor).

import os
from arize.otel import register
from openinference.instrumentation.instructor import InstructorInstrumentor
from openinference.instrumentation.openai import OpenAIInstrumentor # Or your LLM client's instrumentor

# Ensure your API keys are set as environment variables
# ARIZE_SPACE_ID = os.getenv("ARIZE_SPACE_ID")
# ARIZE_API_KEY = os.getenv("ARIZE_API_KEY")
# OPENAI_API_KEY = os.getenv("OPENAI_API_KEY") # For the example

# Setup OTel via Arize's convenience function
tracer_provider = register(
    space_id=os.getenv("ARIZE_SPACE_ID"),
    api_key=os.getenv("ARIZE_API_KEY"),
    project_name="my-instructor-app" # Choose a project name
)

# Instrument Instructor
InstructorInstrumentor().instrument(tracer_provider=tracer_provider)
# Instrument the underlying LLM client
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider) # Example for OpenAI

print("Instructor and OpenAI client instrumented for Arize.")

Run Instructor Example

Now you can use Instructor as you normally would. The instrumentors will capture traces.

import instructor
from pydantic import BaseModel
from openai import OpenAI # Ensure OPENAI_API_KEY is set

# Define your desired output structure
class UserInfo(BaseModel):
    name: str
    age: int

# Patch the OpenAI client with Instructor
# The OpenAI client itself will be instrumented by OpenAIInstrumentor
# InstructorInstrumentor will trace the .create call patched by instructor.from_openai
client = instructor.from_openai(OpenAI())

# Extract structured data
user_info_response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=UserInfo, # Instructor specific
    messages=[{"role": "user", "content": "John Doe is 30 years old."}]
)

print(f"Name: {user_info_response.name}")
print(f"Age: {user_info_response.age}")

# Example that may raise a validation error (the prompt omits the age)
try:
    invalid_user_info = client.chat.completions.create(
        model="gpt-3.5-turbo",
        response_model=UserInfo,
        messages=[{"role": "user", "content": "The user is Jane."}], # Age is missing
        max_retries=1 # Optional: limit retries for demonstration
    )
except Exception as e:
    print(f"Failed to extract valid UserInfo: {e}")

Observe in Arize

After running your Instructor application, traces will be sent to your Arize project. You can then log in to Arize to:

  • Visualize the calls made via Instructor.

  • Inspect the structured data extraction process, including any retries or validation errors when Instructor's spans capture them.

  • See the underlying LLM calls made by Instructor, along with their inputs and outputs.

  • Analyze latency and errors for both Instructor operations and LLM calls.
