Building Visibility Into SupportBot

You’re building a customer support agent called SupportBot. It classifies incoming queries, routes them to the right handler, and responds appropriately. But when something goes wrong, you’re left wondering: Why did the agent choose that tool instead of this one? What context was actually passed to the LLM when it generated that response? Without observability, debugging LLM applications feels like flying blind. In this chapter, you’ll learn how to instrument SupportBot with Arize AX to capture complete execution traces that answer these questions.

What We’re Building

SupportBot has three core functions:
  1. Classify incoming queries (order status vs. FAQ)
  2. Route to the appropriate handler
  3. Execute either database lookups or RAG-based knowledge searches
By the end of this chapter, you’ll have complete visibility into every step of this process.

Follow Along with the Complete Python Notebook

Prerequisites

Before starting, make sure you have:
  • An Arize AX account (sign up for free at app.arize.com)
  • Your Space ID and API Key (found in Settings → API Keys)
  • OpenAI API key (or another supported LLM provider)

Step 1: Install Dependencies

First, install the necessary packages for tracing:
pip install openinference-instrumentation-openai openai arize-otel

Step 2: Configure Tracing

Next, set up the connection to Arize AX. This is where all your traces will be sent.
from arize.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor
import openai
from opentelemetry import trace

# Connect to Arize AX
tracer_provider = register(
    space_id="your-space-id",  # From Settings → API Keys
    api_key="your-api-key",    # From Settings → API Keys
    project_name="supportbot-tutorial",
)

# Instrument OpenAI to automatically capture traces
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

# Initialize OpenAI client and tracer
openai_client = openai.OpenAI()
tracer = trace.get_tracer(__name__)
That’s it! With just these few lines, OpenInference will automatically capture traces for all your OpenAI calls. The instrumentation handles all the OpenTelemetry boilerplate for you.

Step 3: Trace Your First LLM Call

Let’s start simple. Here’s how to classify a user query:
def classify_query(user_message: str) -> str:
    """Classify if the query is about order status or a general FAQ."""

    with tracer.start_as_current_span("classify-query") as span:
        span.set_attribute("openinference.span.kind", "CHAIN")
        span.set_attribute("input.value", user_message)

        response = openai_client.chat.completions.create(
            model="gpt-4",
            messages=[
                {
                    "role": "system",
                    "content": "You are a classifier. Respond with either 'ORDER_STATUS' or 'FAQ'."
                },
                {"role": "user", "content": user_message}
            ],
        )

        classification = response.choices[0].message.content.strip()
        span.set_attribute("output.value", classification)

        return classification

# Try it out
result = classify_query("Where is my order #12345?")
print(f"Classification: {result}")

What Just Happened?

  1. Created a parent span named “classify-query” that groups related operations
  2. Automatically captured the OpenAI call details (the instrumentation did this)
  3. Set OpenInference attributes to help Arize understand the span type and data
  4. Sent everything to Arize AX for visualization
Head to your Arize AX dashboard, and you’ll see this trace appear in real time!

Step 4: Add Tool Call Tracing

Now let’s add the ability to look up order status. We’ll use OpenAI’s function calling:
import json

def get_order_status(order_id: str) -> dict:
    """Simulate looking up an order in a database."""
    return {
        "order_id": order_id,
        "status": "In Transit",
        "estimated_delivery": "2024-03-15"
    }

def handle_order_query(user_message: str) -> str:
    """Handle order status queries using tool calling."""

    with tracer.start_as_current_span("handle-order-query") as span:
        span.set_attribute("openinference.span.kind", "CHAIN")
        span.set_attribute("input.value", user_message)

        tools = [
            {
                "type": "function",
                "function": {
                    "name": "get_order_status",
                    "description": "Look up the status of a customer order",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "order_id": {
                                "type": "string",
                                "description": "The order ID (e.g., '12345')"
                            }
                        },
                        "required": ["order_id"]
                    }
                }
            }
        ]

        response = openai_client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": user_message}],
            tools=tools,
        )

        # Check if the model wants to call a tool
        if response.choices[0].message.tool_calls:
            tool_call = response.choices[0].message.tool_calls[0]

            # Execute the tool with tracing
            with tracer.start_as_current_span("execute-tool") as tool_span:
                tool_span.set_attribute("openinference.span.kind", "TOOL")
                tool_span.set_attribute("tool.name", tool_call.function.name)
                tool_span.set_attribute("tool.parameters", tool_call.function.arguments)

                args = json.loads(tool_call.function.arguments)
                result = get_order_status(args["order_id"])

                tool_span.set_attribute("tool.result", json.dumps(result))

            # Generate final response with tool result
            final_response = openai_client.chat.completions.create(
                model="gpt-4",
                messages=[
                    {"role": "user", "content": user_message},
                    response.choices[0].message,
                    {
                        "role": "tool",
                        "tool_call_id": tool_call.id,
                        "content": json.dumps(result)
                    }
                ],
            )

            span.set_attribute("output.value", final_response.choices[0].message.content)
            return final_response.choices[0].message.content

        return response.choices[0].message.content

# Try it out
response = handle_order_query("What's the status of order 12345?")
print(response)
Now in Arize AX, you’ll see a complete trace tree showing:
  • The parent “handle-order-query” span
  • The LLM call that decided to use a tool
  • The “execute-tool” span with parameters and results
  • The final LLM call that formulated the response
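The handler above hard-codes a single tool. As SupportBot gains more tools, a dispatch table keeps the “execute-tool” step generic. Here is a minimal pure-Python sketch, with `get_order_status` stubbed as in Step 4:

```python
import json

def get_order_status(order_id: str) -> dict:
    """Stubbed database lookup, as in Step 4."""
    return {"order_id": order_id, "status": "In Transit"}

# Map tool names (as declared in the `tools` schema) to local handlers.
TOOL_REGISTRY = {
    "get_order_status": get_order_status,
}

def execute_tool(name: str, arguments_json: str) -> dict:
    """Dispatch a model-requested tool call to its Python implementation."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        raise ValueError(f"Unknown tool: {name}")
    # The model returns tool arguments as a JSON string.
    args = json.loads(arguments_json)
    return handler(**args)

result = execute_tool("get_order_status", '{"order_id": "12345"}')
```

Inside the “execute-tool” span you would then call `execute_tool(tool_call.function.name, tool_call.function.arguments)` instead of naming each tool explicitly.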

Step 5: Add RAG Tracing

For FAQ queries, we’ll use a simple RAG pipeline. Let’s trace both the retrieval and generation steps:
def embed_text(text: str) -> list[float]:
    """Generate embedding for text."""

    with tracer.start_as_current_span("embed-text") as span:
        span.set_attribute("openinference.span.kind", "EMBEDDING")
        span.set_attribute("input.value", text)
        span.set_attribute("embedding.model_name", "text-embedding-3-small")
        span.set_attribute(
            "embedding.invocation_parameters",
            json.dumps({"model": "text-embedding-3-small"})
        )

        response = openai_client.embeddings.create(
            model="text-embedding-3-small",
            input=text
        )

        embedding = response.data[0].embedding

        span.set_attribute("embedding.embeddings.0.embedding.text", text)
        span.set_attribute("embedding.embeddings.0.embedding.vector", embedding)

        return embedding

def retrieve_documents(query: str, knowledge_base: list[str], top_k: int = 2) -> list[str]:
    """Retrieve relevant documents using embeddings."""

    with tracer.start_as_current_span("retrieve-documents") as span:
        span.set_attribute("openinference.span.kind", "RETRIEVER")
        span.set_attribute("input.value", query)

        # Generate query embedding (in production, use it to search a vector DB)
        query_embedding = embed_text(query)
        span.set_attribute("retrieval.query_embedding_dims", len(query_embedding))

        # In production: search vector DB with query_embedding and return top_k
        retrieved = knowledge_base[:top_k]

        # Record retrieved documents using OpenInference semantic conventions
        for i, doc in enumerate(retrieved):
            span.set_attribute(f"retrieval.documents.{i}.document.id", str(i))
            span.set_attribute(f"retrieval.documents.{i}.document.content", doc)

        return retrieved

def handle_faq_query(user_message: str) -> str:
    """Handle FAQ queries using RAG."""

    # Sample knowledge base
    knowledge_base = [
        "We offer free shipping on orders over $50.",
        "Returns are accepted within 30 days of purchase.",
        "You can track your order using the tracking number in your confirmation email.",
    ]

    with tracer.start_as_current_span("handle-faq-query") as span:
        span.set_attribute("openinference.span.kind", "CHAIN")
        span.set_attribute("input.value", user_message)

        # Retrieve relevant documents
        relevant_docs = retrieve_documents(user_message, knowledge_base)

        # Generate response with retrieved context
        context = "\n".join(relevant_docs)
        response = openai_client.chat.completions.create(
            model="gpt-4",
            messages=[
                {
                    "role": "system",
                    "content": f"Answer the user's question using this context:\n{context}"
                },
                {"role": "user", "content": user_message}
            ],
        )

        answer = response.choices[0].message.content
        span.set_attribute("output.value", answer)

        return answer

# Try it out
response = handle_faq_query("What's your shipping policy?")
print(response)
Perfect! Now you can see exactly:
  • What query was sent to retrieval
  • Which documents were retrieved
  • What context was passed to the LLM
  • What answer was generated
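One caveat: the `retrieve_documents` stub above always returns the first `top_k` documents. In production you would rank documents by embedding similarity. A minimal in-memory sketch using cosine similarity, assuming you already have embeddings for the query and for each document:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def rank_documents(
    query_embedding: list[float],
    doc_embeddings: list[list[float]],
    top_k: int = 2,
) -> list[int]:
    """Return indices of the top_k documents most similar to the query."""
    scores = [cosine_similarity(query_embedding, e) for e in doc_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]

# Toy 2-d embeddings: doc 0 points the same way as the query, doc 1 is orthogonal.
query = [1.0, 0.0]
docs = [[0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]
print(rank_documents(query, docs))  # → [0, 2]
```

Real vector databases perform this ranking for you at scale; the point here is only to show what the `knowledge_base[:top_k]` placeholder stands in for.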

Putting It All Together

Here’s the complete SupportBot with full tracing:
def supportbot(user_message: str) -> str:
    """Main SupportBot entry point with complete tracing."""

    with tracer.start_as_current_span("supportbot") as span:
        span.set_attribute("openinference.span.kind", "AGENT")
        span.set_attribute("input.value", user_message)

        # Step 1: Classify
        classification = classify_query(user_message)

        # Step 2: Route and handle
        if classification == "ORDER_STATUS":
            response = handle_order_query(user_message)
        else:
            response = handle_faq_query(user_message)

        span.set_attribute("output.value", response)
        span.set_attribute("classification", classification)

        return response

# Test it out
queries = [
    "Where is my order #12345?",
    "What's your return policy?",
    "Can I track my order?",
]

for query in queries:
    print(f"\nUser: {query}")
    print(f"Bot: {supportbot(query)}")
Congratulations! You now have complete visibility into your LLM application!

What’s Next?

Tracing gives you visibility, but how do you measure quality at scale? In the next chapter, Annotations and Evaluations, you’ll learn how to:
  • Annotate traces directly in the Arize AX UI
  • Run automated quality evaluations
  • Identify systemic issues across thousands of traces
  • Transform debugging from manual inspection into data-driven analysis
Let’s continue! →