You’re building a customer support agent called SupportBot. It classifies incoming queries, routes them to the right handler, and responds appropriately. But when something goes wrong, you’re left wondering: Why did the agent choose that tool instead of this one? What context was actually passed to the LLM when it generated that response?

Without observability, debugging LLM applications feels like flying blind. In this chapter, you’ll learn how to instrument SupportBot with Arize AX to capture complete execution traces that answer these questions.
Next, set up the connection to Arize AX. This is where all your traces will be sent.
from arize.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor
import openai
from opentelemetry import trace

# Connect to Arize AX
tracer_provider = register(
    space_id="your-space-id",  # From Settings → API Keys
    api_key="your-api-key",    # From Settings → API Keys
    project_name="supportbot-tutorial",
)

# Instrument OpenAI to automatically capture traces
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

# Initialize OpenAI client and tracer
openai_client = openai.OpenAI()
tracer = trace.get_tracer(__name__)
That’s it! With just these few lines, OpenInference will automatically
capture traces for all your OpenAI calls. The instrumentation handles all the
OpenTelemetry boilerplate for you.
Let’s start simple. Here’s how to classify a user query:
def classify_query(user_message: str) -> str:
    """Classify if the query is about order status or a general FAQ."""
    with tracer.start_as_current_span("classify-query") as span:
        span.set_attribute("openinference.span.kind", "CHAIN")
        span.set_attribute("input.value", user_message)

        response = openai_client.chat.completions.create(
            model="gpt-4",
            messages=[
                {
                    "role": "system",
                    "content": "You are a classifier. Respond with either 'ORDER_STATUS' or 'FAQ'."
                },
                {"role": "user", "content": user_message}
            ],
        )

        classification = response.choices[0].message.content.strip()
        span.set_attribute("output.value", classification)
        return classification

# Try it out
result = classify_query("Where is my order #12345?")
print(f"Classification: {result}")
Now let’s add the ability to look up order status. We’ll use OpenAI’s function calling:
import json

def get_order_status(order_id: str) -> dict:
    """Simulate looking up an order in a database."""
    return {
        "order_id": order_id,
        "status": "In Transit",
        "estimated_delivery": "2024-03-15"
    }

def handle_order_query(user_message: str) -> str:
    """Handle order status queries using tool calling."""
    with tracer.start_as_current_span("handle-order-query") as span:
        span.set_attribute("openinference.span.kind", "CHAIN")
        span.set_attribute("input.value", user_message)

        tools = [
            {
                "type": "function",
                "function": {
                    "name": "get_order_status",
                    "description": "Look up the status of a customer order",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "order_id": {
                                "type": "string",
                                "description": "The order ID (e.g., '12345')"
                            }
                        },
                        "required": ["order_id"]
                    }
                }
            }
        ]

        response = openai_client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": user_message}],
            tools=tools,
        )

        # Check if the model wants to call a tool
        if response.choices[0].message.tool_calls:
            tool_call = response.choices[0].message.tool_calls[0]

            # Execute the tool with tracing
            with tracer.start_as_current_span("execute-tool") as tool_span:
                tool_span.set_attribute("openinference.span.kind", "TOOL")
                tool_span.set_attribute("tool.name", tool_call.function.name)
                tool_span.set_attribute("tool.parameters", tool_call.function.arguments)

                args = json.loads(tool_call.function.arguments)
                result = get_order_status(args["order_id"])
                tool_span.set_attribute("tool.result", json.dumps(result))

            # Generate final response with tool result
            final_response = openai_client.chat.completions.create(
                model="gpt-4",
                messages=[
                    {"role": "user", "content": user_message},
                    response.choices[0].message,
                    {
                        "role": "tool",
                        "tool_call_id": tool_call.id,
                        "content": json.dumps(result)
                    }
                ],
            )

            span.set_attribute("output.value", final_response.choices[0].message.content)
            return final_response.choices[0].message.content

        return response.choices[0].message.content

# Try it out
response = handle_order_query("What's the status of order 12345?")
print(response)
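With classification and order handling both in place, SupportBot needs a small piece of glue that routes each query to the right handler. The sketch below is illustrative only: the real classify_query and handle_order_query call the LLM, so hypothetical stub functions stand in for them here, and an FAQ handler is assumed to exist in your application.

```python
def route_query(user_message: str, classify, handlers: dict, default) -> str:
    """Classify a message, then dispatch it to the matching handler.

    Falls back to `default` if the classifier returns an unknown label.
    """
    label = classify(user_message)
    handler = handlers.get(label, default)
    return handler(user_message)

# Hypothetical stand-ins for the LLM-backed functions defined above.
def stub_classify(msg: str) -> str:
    return "ORDER_STATUS" if "order" in msg.lower() else "FAQ"

handlers = {
    "ORDER_STATUS": lambda msg: f"order handler received: {msg}",
    "FAQ": lambda msg: f"faq handler received: {msg}",
}

print(route_query("Where is my order #12345?", stub_classify, handlers, handlers["FAQ"]))
```

In the real agent you would pass classify_query and handle_order_query in place of the stubs; keeping the router a plain function like this also makes it easy to wrap in its own parent span so every turn produces a single trace.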
Now in Arize AX, you’ll see a complete trace tree showing:
The parent “handle-order-query” span
The LLM call that decided to use a tool
The “execute-tool” span with parameters and results