How to trace LlamaIndex Workflow applications and send data to Arize.
LlamaIndex Workflows are a powerful feature within LlamaIndex for building complex, agentic applications. Arize allows you to observe these workflows by capturing traces through the OpenInference LlamaIndexInstrumentor.
Note: The standard LlamaIndexInstrumentor (covered in the main LlamaIndex Tracing with Arize guide) automatically captures traces for LlamaIndex Workflows. If you've already instrumented LlamaIndex following that guide, your workflows should already be traced. This page provides a consolidated view for workflow-specific considerations.
We recommend using llama_index >= 0.11.0 for the best experience with Workflows and OpenInference.
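To confirm which version you have installed, you can query the installed distribution with the Python standard library (a quick sketch; the distribution name llama-index-core assumes a standard pip-based install of LlamaIndex):

from importlib.metadata import version

# Prints the installed llama-index-core version; expect >= 0.11.0
print(version("llama-index-core"))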
API Key Setup
Before running your application, ensure you have the following environment variables set:
export ARIZE_SPACE_ID="YOUR_ARIZE_SPACE_ID"
export ARIZE_API_KEY="YOUR_ARIZE_API_KEY"
export OPENAI_API_KEY="YOUR_OPENAI_API_KEY"  # For LlamaIndex examples using OpenAI
You can find your Arize Space ID and API Key in your Arize account settings.
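If you prefer to fail fast on missing credentials, a minimal check like the following (purely illustrative, not part of any SDK) verifies the variables before your application starts:

import os

# Raise early if any required credential is missing
for var in ("ARIZE_SPACE_ID", "ARIZE_API_KEY", "OPENAI_API_KEY"):
    if not os.getenv(var):
        raise RuntimeError(f"Missing required environment variable: {var}")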
Install
Ensure you have LlamaIndex and the OpenInference LlamaIndex instrumentor installed, along with Arize OTel and supporting OpenTelemetry packages:
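For example (package names assume a pip-based setup; pin versions as needed):

pip install "llama-index>=0.11.0" openinference-instrumentation-llama-index arize-otel opentelemetry-sdk opentelemetry-exporter-otlp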
Setup Tracing
Initialize the LlamaIndexInstrumentor after setting up the Arize OpenTelemetry exporter. The same setup instruments both general LlamaIndex usage and LlamaIndex Workflows.
import os

from arize.otel import register
from openinference.instrumentation.llama_index import LlamaIndexInstrumentor

# Ensure your API keys are set as environment variables:
# ARIZE_SPACE_ID, ARIZE_API_KEY, and OPENAI_API_KEY (for the example)

# Set up OTel via Arize's convenience function
tracer_provider = register(
    space_id=os.getenv("ARIZE_SPACE_ID"),
    api_key=os.getenv("ARIZE_API_KEY"),
    project_name="my-llamaindex-workflows-app",  # Choose a project name
)

# Instrument LlamaIndex (this covers Workflows as well)
LlamaIndexInstrumentor().instrument(tracer_provider=tracer_provider)
print("LlamaIndex (including Workflows) instrumented for Arize.")
Run LlamaIndex Workflows Example
Once LlamaIndex is instrumented as shown above, spans are created whenever a workflow or agent is invoked and are sent to Arize.
# Example (adapted from the LlamaIndex Workflow docs; refer to the LlamaIndex
# documentation for more complete Workflow examples).
# Ensure OPENAI_API_KEY is set in your environment.
import asyncio

from llama_index.core.workflow import (
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)
from llama_index.llms.openai import OpenAI  # Example LLM


class AddOneEvent(Event):
    """Carries the incremented value (and the original query) to the next step."""
    value: int
    query: str


class MyWorkflow(Workflow):
    """Adds one to the input, then multiplies by an LLM-generated number."""

    @step
    async def add_one(self, ev: StartEvent) -> AddOneEvent:
        print(f"Adding one to {ev.initial_value}")
        return AddOneEvent(value=ev.initial_value + 1, query=ev.query)

    @step
    async def multiply_by_llm_output(self, ev: AddOneEvent) -> StopEvent:
        print(f"Multiplying {ev.value} by LLM output for query: {ev.query}")
        llm = OpenAI(model="gpt-3.5-turbo")
        response = (await llm.acomplete(f"Return a single number: {ev.query}")).text
        try:
            multiplier = int(response.strip())
            return StopEvent(result={"value": ev.value * multiplier, "llm_response": response})
        except ValueError:
            print(f"Could not parse LLM output as integer: {response}")
            return StopEvent(
                result={"value": ev.value, "llm_response": response, "error": "LLM output not an int"}
            )


async def main():
    my_workflow = MyWorkflow(timeout=60)
    try:
        result = await my_workflow.run(initial_value=5, query="What is 2 + 2?")
        print(f"Workflow Result: {result['value']}")
        if result.get("error"):
            print(f"Error during execution: {result['error']}")
    except Exception as e:
        print(f"Workflow failed: {e}")


asyncio.run(main())
Observe in Arize
Now that you have tracing set up, all invocations of LlamaIndex Workflows will be streamed to Arize for observability and evaluation. You can visualize the steps within your workflows, the inputs and outputs of each step, and any LLM calls made.
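If your application is a short-lived script, the process may exit before the batch span processor has exported everything. A minimal sketch, assuming the tracer_provider returned by register() in the setup above, flushes pending spans before exit:

# Flush any buffered spans before the process exits
tracer_provider.force_flush()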