Llama Tracing
Ollama has built-in compatibility with the OpenAI Chat Completions API, so you can instrument an open-source Llama model served locally with the same tooling and applications you would use with OpenAI.
Prerequisites
Install Ollama
Download and install Ollama, or run the installation script below. Either option installs Ollama and its required dependencies automatically.
!curl https://ollama.ai/install.sh | sh
Launch Ollama
Once Ollama is installed, start the server using the following command.
ollama serve
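Before instrumenting anything, it can help to confirm the server is reachable. The snippet below is a quick sketch that lists the models the local Ollama server knows about; it assumes the default endpoint on localhost:11434 and the ollama Python client (pip install ollama).
import ollama

# This call contacts the local Ollama server and fails if `ollama serve` is not running.
print(ollama.list())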
# Import open-telemetry dependencies
import ollama
from arize.otel import Endpoint, register
from openai import OpenAI
# Declare model name
LLAMA_MODEL_NAME = "llama3.2:1b"
# Download the llama3.2:1b model to run locally.
ollama.pull(LLAMA_MODEL_NAME)
# Setup OTEL via our convenience function
tracer_provider = register(
    space_id="your-space-id",   # in app space settings page
    api_key="your-api-key",     # in app space settings page
    model_id="your-model-id",   # name this whatever you would like
)
# Import the automatic instrumentor from OpenInference
from openinference.instrumentation.openai import OpenAIInstrumentor
# Enable automatic instrumentation of OpenAI client calls
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # required by the OpenAI client, but unused by Ollama
)
query = "Why is the sky blue?"
response = client.chat.completions.create(
    model=LLAMA_MODEL_NAME,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": query},
    ],
)
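The response object follows the OpenAI SDK's Chat Completions format, so the model's reply can be read from the first choice, for example:
# Print the assistant's reply from the traced completion
print(response.choices[0].message.content)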
Now start asking questions to your LLM app and watch the traces being collected in Arize. For more examples of instrumenting OpenAI applications, see the openinference-instrumentation-openai examples.