This notebook demonstrates split-stream OpenTelemetry tracing for agents running on Databricks: the same spans are exported to Arize AX for real-time observability and to Unity Catalog Delta tables for governed storage.
This notebook will only run in a Databricks workspace environment.

Link to Notebook Tutorial

In this notebook you learn to:
  • Configure a single OpenTelemetry TracerProvider with two span processors — one exporting to Arize AX and one exporting to Databricks Unity Catalog
  • Stream OTLP spans to Arize AX for real-time trace exploration, evaluation, and monitoring
  • Stream the same OTLP spans into Unity Catalog Delta tables for retention, governance, and SQL analytics
  • Instrument an external-style OpenAI + LangChain tool-calling agent with OpenInference auto-instrumentation
  • Bind an MLflow experiment to Unity Catalog so traces are queryable via Databricks SQL
  • Verify ingest on both streams — SQL against the UC spans table and the trace explorer in Arize AX

Prerequisites

  • Databricks account and Unity Catalog workspace with the OpenTelemetry on Databricks preview enabled (Sign up for free)
  • Arize AX account (Sign up for free)
  • UC permissions: USE CATALOG, USE SCHEMA, CREATE SCHEMA (if needed), and MODIFY + SELECT on the OTel tables
  • The ID of a SQL warehouse on which you have CAN USE permission
  • An OpenAI API key

Install dependencies

%pip install \
  "mlflow[databricks]>=3.11.0" \
  "arize-otel>=0.8.0" \
  "openinference-instrumentation-langchain>=0.1.30" \
  "langchain>=1.0.0" \
  "langchain-core>=0.3.0" \
  "langchain-openai>=0.2.0" \
  "opentelemetry-exporter-otlp-proto-http>=1.27.0" \
  "openinference-semantic-conventions>=0.1.12" \
  --quiet
dbutils.library.restartPython()

Access Arize AX and Databricks keys from Databricks Secrets

Create an Arize AX API key and look up your Space ID. Store them, together with the other credentials listed below, in Databricks Secrets, then map them to environment variables. Run this configuration cell after the install cell restarts Python. Set the widgets below, or map the secrets to environment variables in your cluster policy:
Variable | Description
catalog_name / schema_name / table_prefix | Unity Catalog trace location
experiment_name | MLflow experiment (optional trace UI)
sql_warehouse_id | Warehouse for trace search / SQL
arize_project_name | Arize project for spans
Secrets: ARIZE_SPACE_ID, ARIZE_API_KEY, DATABRICKS_HOST, DATABRICKS_TOKEN, OPENAI_API_KEY
import os

dbutils.widgets.text("catalog_name", "main", "UC catalog")
dbutils.widgets.text("schema_name", "otel_traces", "UC schema")
dbutils.widgets.text("table_prefix", "partner_demo", "UC table prefix")
dbutils.widgets.text(
    "experiment_name",
    "/Shared/partner-dual-ingest-demo",
    "MLflow experiment",
)
dbutils.widgets.text("sql_warehouse_id", "default-warehouse-id", "SQL warehouse ID")
dbutils.widgets.text("arize_project_name", "databricks-dual-ingest-demo", "Arize project")
dbutils.widgets.text("openai_model", "gpt-4o-mini", "OpenAI model")
dbutils.widgets.text(
    "demo_user_message",
    "What does our travel policy say about demo expenses?",
    "Demo prompt",
)

# Load credentials from a secret scope (replace scope name with your own)
scope = "arize-databricks-partner"
os.environ["ARIZE_SPACE_ID"] = dbutils.secrets.get(scope=scope, key="arize-space-id")
os.environ["ARIZE_API_KEY"] = dbutils.secrets.get(scope=scope, key="arize-api-key")
os.environ["DATABRICKS_HOST"] = dbutils.secrets.get(scope=scope, key="databricks-host")
os.environ["DATABRICKS_TOKEN"] = dbutils.secrets.get(scope=scope, key="databricks-token")
os.environ["OPENAI_API_KEY"] = dbutils.secrets.get(scope=scope, key="openai-api-key")

catalog_name = dbutils.widgets.get("catalog_name").strip()
schema_name = dbutils.widgets.get("schema_name").strip()
table_prefix = dbutils.widgets.get("table_prefix").strip()
experiment_name = dbutils.widgets.get("experiment_name").strip()
sql_warehouse_id = dbutils.widgets.get("sql_warehouse_id").strip()
arize_project_name = dbutils.widgets.get("arize_project_name").strip()
openai_model = dbutils.widgets.get("openai_model").strip()
demo_user_message = dbutils.widgets.get("demo_user_message").strip()

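# Reject both an empty value and the widget's placeholder default.
if not sql_warehouse_id or sql_warehouse_id == "default-warehouse-id":
    raise ValueError("Set the sql_warehouse_id widget to a warehouse you can use.")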

os.environ["MLFLOW_TRACING_SQL_WAREHOUSE_ID"] = sql_warehouse_id

# Fall back to the current workspace URL if the secret above resolved to an empty value
if not os.environ.get("DATABRICKS_HOST"):
    try:
        os.environ["DATABRICKS_HOST"] = (
            dbutils.notebook.entry_point.getDbutils()
            .notebook()
            .getContext()
            .apiUrl()
            .get()
            .rstrip("/")
        )
    except Exception:
        pass
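
Before moving on, it can help to confirm that every credential actually resolved to a non-blank value, so that a missing secret is reported here rather than as an authentication error later. The following is a minimal sketch, not part of the original notebook: the helper name missing_env and the REQUIRED_ENV tuple are hypothetical, and the example uses an explicit mapping with placeholder values so that no real secret is read or printed.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Environment variables the notebook expects (taken from the Secrets list above).
REQUIRED_ENV = (
    "ARIZE_SPACE_ID",
    "ARIZE_API_KEY",
    "DATABRICKS_HOST",
    "DATABRICKS_TOKEN",
    "OPENAI_API_KEY",
)


def missing_env(
    names: Iterable[str] = REQUIRED_ENV,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names that are unset or blank, in the order given."""
    env = os.environ if env is None else env
    return [name for name in names if not (env.get(name) or "").strip()]


# Example with placeholder values; a whitespace-only value counts as missing.
example_env = {
    "ARIZE_SPACE_ID": "space",
    "ARIZE_API_KEY": "   ",
    "DATABRICKS_HOST": "https://example.cloud.databricks.com",
}
missing = missing_env(env=example_env)
print(missing)
```

In the notebook itself, calling missing_env() with no arguments checks os.environ, and raising a ValueError when the returned list is non-empty reports all gaps at once.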

Bind the MLflow experiment to Unity Catalog

Calling mlflow.set_experiment with a UnityCatalog trace location creates the {prefix}_otel_spans table (and related tables) and links the MLflow experiment for optional trace UI browsing. The full table name is used later as the export target for the Databricks span processor.
import mlflow
from mlflow.entities.trace_location import UnityCatalog

mlflow.set_tracking_uri("databricks")

experiment = mlflow.set_experiment(
    experiment_name=experiment_name,
    trace_location=UnityCatalog(
        catalog_name=catalog_name,
        schema_name=schema_name,
        table_prefix=table_prefix,
    ),
)

uc_spans_table = experiment.trace_location.full_otel_spans_table_name
print(f"Experiment ID: {experiment.experiment_id}")
print(f"UC spans table: {uc_spans_table}")
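
As a sanity check, the printed table name can be compared with the name implied by the {catalog}.{schema}.{prefix}_otel_spans pattern. The sketch below is illustrative only: expected_spans_table is a hypothetical helper, and the restriction to letters, digits, and underscores is a conservative assumption made here to catch typos, not a documented Unity Catalog rule. The value returned by MLflow remains the authoritative one.

```python
import re

# Assumption: limit each part to letters, digits, and underscores to catch typos.
_SIMPLE_IDENTIFIER = re.compile(r"^[A-Za-z0-9_]+$")


def expected_spans_table(catalog: str, schema: str, prefix: str) -> str:
    """Build the spans table name from the {catalog}.{schema}.{prefix}_otel_spans pattern."""
    for label, value in (("catalog", catalog), ("schema", schema), ("prefix", prefix)):
        if not _SIMPLE_IDENTIFIER.match(value):
            raise ValueError(f"Unexpected characters in {label}: {value!r}")
    return f"{catalog}.{schema}.{prefix}_otel_spans"


# Uses the widget defaults from the configuration cell.
expected = expected_spans_table("main", "otel_traces", "partner_demo")
print(expected)  # main.otel_traces.partner_demo_otel_spans
```

In the notebook, comparing expected_spans_table(catalog_name, schema_name, table_prefix) with uc_spans_table would flag a mismatch between the widgets and the bound experiment.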

Configure dual OTLP export

This is the core of the pattern: a single OpenTelemetry TracerProvider with two span processors. Spans are produced once by the instrumented agent and fanned out to both destinations.

Stream 1: Arize AX. The Arize exporter ships OTLP spans over gRPC by default for sub-second ingest. The arize-otel package provides the exporter, endpoint, and transport helpers. The arize_project_name value is set as a resource attribute and is required for spans to land in the correct project.

Stream 2: Databricks Unity Catalog. A standard OTLP exporter posts the same spans to the Databricks OTel collector endpoint (/api/2.0/otel/v1/traces). The X-Databricks-UC-Table-Name header routes spans into the Unity Catalog Delta table created above.
import os
from typing import Optional

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

from arize.otel import BatchSpanProcessor as ArizeBatchSpanProcessor
from arize.otel import Endpoint
from arize.otel import GRPCSpanExporter as ArizeGRPCSpanExporter
from arize.otel import HTTPSpanExporter as ArizeHTTPSpanExporter
from arize.otel import Transport
from openinference.semconv.resource import ResourceAttributes


def _arize_endpoint():
    raw = (os.environ.get("ARIZE_COLLECTOR_ENDPOINT") or "").strip()
    if "eu-west" in raw:
        return Endpoint.ARIZE_EUROPE
    if raw.startswith("http://") or raw.startswith("https://"):
        return raw.rstrip("/")
    return Endpoint.ARIZE


def _arize_transport():
    mode = (os.environ.get("ARIZE_TRANSPORT") or "grpc").strip().lower()
    return Transport.HTTP if mode == "http" else Transport.GRPC


def _require(name: str, value: Optional[str]) -> str:
    if not value or not str(value).strip():
        raise ValueError(f"Missing required configuration: {name}")
    return str(value).strip()


def configure_dual_export(
    *,
    project_name: str,
    uc_spans_table: str,
    service_name: str = "langchain-external-agent-demo",
) -> TracerProvider:
    project_name = _require("project_name", project_name)
    space_id = _require(
        "arize_space_id",
        os.environ.get("ARIZE_SPACE_ID") or os.environ.get("ARIZE_SPACE"),
    )
    api_key = _require("arize_api_key", os.environ.get("ARIZE_API_KEY"))
    host = _require("databricks_host", os.environ.get("DATABRICKS_HOST")).rstrip("/")
    token = _require("databricks_token", os.environ.get("DATABRICKS_TOKEN"))

    resource = Resource.create(
        {
            # ResourceAttributes.PROJECT_NAME resolves to "openinference.project.name".
            ResourceAttributes.PROJECT_NAME: project_name,
            "service.name": service_name,
            "model_id": project_name,
        }
    )
    provider = TracerProvider(resource=resource)

    # --- Stream 1: Arize AX (gRPC by default) ---
    endpoint = _arize_endpoint()
    transport = _arize_transport()
    if transport == Transport.GRPC:
        arize_exporter = ArizeGRPCSpanExporter(
            space_id=space_id, api_key=api_key, endpoint=endpoint
        )
    else:
        arize_exporter = ArizeHTTPSpanExporter(
            space_id=space_id, api_key=api_key, endpoint=endpoint
        )

    provider.add_span_processor(ArizeBatchSpanProcessor(span_exporter=arize_exporter))
    print(
        f"Arize export: transport={transport.value}, endpoint={endpoint}, project={project_name}"
    )

    # --- Stream 2: Databricks Unity Catalog (OTLP HTTP) ---
    dbx_exporter = OTLPSpanExporter(
        endpoint=f"{host}/api/2.0/otel/v1/traces",
        headers={
            "content-type": "application/x-protobuf",
            "X-Databricks-UC-Table-Name": uc_spans_table,
            "Authorization": f"Bearer {token}",
        },
    )
    provider.add_span_processor(BatchSpanProcessor(dbx_exporter))
    trace.set_tracer_provider(provider)
    return provider


def shutdown_tracer(provider: TracerProvider) -> None:
    provider.force_flush()
    provider.shutdown()


tracer_provider = configure_dual_export(
    project_name=arize_project_name,
    uc_spans_table=uc_spans_table,
    service_name="langchain-external-agent-demo",
)
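
The Databricks half of the export depends on only three inputs: the workspace host, a token, and the target table name. The sketch below isolates that piece so it can be checked without sending any spans. The function name databricks_otlp_config is hypothetical, the https:// and three-part-name checks are assumptions added here for illustration, and the endpoint path and header names are the ones used in configure_dual_export above.

```python
from typing import Dict, Tuple


def databricks_otlp_config(
    host: str, token: str, uc_spans_table: str
) -> Tuple[str, Dict[str, str]]:
    """Return the OTLP endpoint URL and headers for the Unity Catalog stream."""
    host = host.strip().rstrip("/")
    # Assumption: workspace URLs are always HTTPS.
    if not host.startswith("https://"):
        raise ValueError("host must start with https://")
    # Assumption: the table is addressed as catalog.schema.table.
    if uc_spans_table.count(".") != 2:
        raise ValueError("uc_spans_table must look like catalog.schema.table")
    endpoint = f"{host}/api/2.0/otel/v1/traces"
    headers = {
        "content-type": "application/x-protobuf",
        "X-Databricks-UC-Table-Name": uc_spans_table,
        "Authorization": f"Bearer {token}",
    }
    return endpoint, headers


# Placeholder values only; never print a real token.
endpoint, headers = databricks_otlp_config(
    "https://example.cloud.databricks.com/",
    "dummy-token",
    "main.otel_traces.partner_demo_otel_spans",
)
print(endpoint)
print(headers["X-Databricks-UC-Table-Name"])
```

Keeping this logic in a pure function makes a malformed host or table name fail at configuration time instead of surfacing as a failed export from the background batch processor.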

Instrument and run the demo agent

With the dual-export TracerProvider in place, a single line of OpenInference auto-instrumentation captures every LLM call and tool invocation as OTLP spans — which the two processors then fan out to Arize AX and Unity Catalog. The demo workload is an OpenAI + LangChain tool-calling agent with a simple policy-lookup tool.
from langchain.agents import create_agent
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from openinference.instrumentation.langchain import LangChainInstrumentor

# One-line auto-instrumentation
LangChainInstrumentor().instrument(tracer_provider=tracer_provider)

@tool
def lookup_policy(topic: str) -> str:
    """Return a short internal policy snippet for the given topic."""
    policies = {
        "travel": "Demo travel expenses under $500 are pre-approved for partner workshops.",
        "security": "All agent outputs must stay within approved data boundaries.",
    }
    return policies.get(topic.lower(), f"No policy on file for '{topic}'.")

llm = ChatOpenAI(model=openai_model, temperature=0)
tools = [lookup_policy]

agent = create_agent(
    llm,
    tools=tools,
    system_prompt=(
        "You are an enterprise assistant. Use tools when you need policy details. Be concise."
    ),
)

print("Running demo agent...")
result = agent.invoke({"messages": [{"role": "user", "content": demo_user_message}]})

messages = result.get("messages", [])
print("\n--- Agent response ---")
print(messages[-1].content if messages else result)

Flush spans

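Force the OTLP batches to be delivered before verification. This is required for short notebook runs where the batch processors may not flush on their own before the cell finishes. Because shutdown_tracer also shuts the provider down, re-run the configuration cell before tracing any further agent runs.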
shutdown_tracer(tracer_provider)
print("Tracer shut down and spans flushed.")

Verify Unity Catalog ingest with Databricks SQL

Spans should appear in {catalog}.{schema}.{prefix}_otel_spans within a short delay after export. Query the governed Delta table directly with Spark SQL.
import time
time.sleep(15)

display(
    spark.sql(f"""
        SELECT
          trace_id,
          span_id,
          name,
          kind,
          start_time_unix_nano,
          end_time_unix_nano,
          (end_time_unix_nano - start_time_unix_nano) / 1e6 AS duration_ms
        FROM {uc_spans_table}
        ORDER BY start_time_unix_nano DESC
        LIMIT 10
    """)
)
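
A fixed 15-second sleep may be longer than needed or too short, depending on ingest delay. As an alternative, the sketch below polls until a check succeeds or a timeout expires. The helper name wait_until and its parameters are hypothetical, and the demonstration uses a fake check so it runs anywhere.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call check() repeatedly until it returns True or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Demonstration with a fake check that succeeds on the third attempt.
attempts = {"n": 0}


def fake_check() -> bool:
    attempts["n"] += 1
    return attempts["n"] >= 3


ok = wait_until(fake_check, timeout_s=1.0, interval_s=0.0, sleep=lambda _s: None)
print(ok, attempts["n"])  # True 3

# In the notebook, the check could be (assuming spark and uc_spans_table exist):
#   wait_until(lambda: spark.sql(f"SELECT 1 FROM {uc_spans_table} LIMIT 1").count() > 0)
```

The sleep and clock parameters exist only to make the helper easy to test; the defaults are appropriate for normal use.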

Verify traces in Arize AX

Open your Arize project (the arize_project_name widget value) in the AX UI. In the Arize AX platform you can see agent execution details, tool invocations, latency breakdown by component, token usage and costs, and metadata captured for each span. From here you can layer on online evaluations, dashboards, and monitors.

Optional: MLflow Experiment UI

In the workspace Experiments page, open {experiment_name} and go to the Traces tab (select your SQL warehouse). This is optional; SQL on the UC spans table is the primary proof that spans reached the governed store.

Lakebase consumption pattern

Use Databricks SQL on the Unity Catalog span tables for analytics. For operational apps, treat Unity Catalog as the system of record and expose a permissioned read path via Lakebase if sub-second app reads are needed.

Next steps

With spans flowing into both Arize AX and Unity Catalog, you have one telemetry source feeding real-time observability and governed long-term storage. From here, set up online evaluations in Arize AX to score your agent’s quality continuously, build custom metrics from trace attributes, and run SQL analytics over the governed UC tables.

Resources

Databricks Resources

Store OpenTelemetry Traces in Unity Catalog

Databricks Secrets