Instrument at scale. These patterns are for production systems: routing through an OTEL Collector for centralized processing, propagating context across services and async boundaries, sampling to control volume and cost, and deploying resiliently.

OTEL Collector

The OpenTelemetry Collector acts as an intermediate processing layer between your applications and Arize. It collects, processes, and routes telemetry data — useful for centralized credential management, data masking, multi-backend routing, and compliance.
[Figure: OpenTelemetry Collector architecture]

Deployment Models

Model        | How it works                                               | Best for
Agent mode   | Collector runs alongside the app (sidecar or DaemonSet)    | Simple setups, clear 1:1 mapping
Gateway mode | Centralized collector receives from multiple apps          | Centralized policy, credential management
Hybrid       | Agent collectors forward to a gateway                      | Large environments, distributed collection + centralized processing

Example Configuration

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 5s
    send_batch_size: 2048
  resource/model_id:
    attributes:
      - action: insert
        key: model_id
        value: "your-project-name"

exporters:
  otlp/arize:
    endpoint: "otlp.arize.com:443"
    headers:
      api_key: "${ARIZE_API_KEY}"
      space_id: "${ARIZE_SPACE_ID}"
    timeout: 20s
    retry_on_failure:
      enabled: true
      initial_interval: 1s
      max_interval: 10s

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [resource/model_id, batch]
      exporters: [otlp/arize]

Dynamic Trace Routing

Route traces to different Arize projects based on span attributes:
extensions:
  headers_setter/space1:
    headers:
      - key: space_id
        value: space_id_1
        action: upsert
      - key: api_key
        value: api_key_1
        action: upsert
  headers_setter/space2:
    headers:
      - key: space_id
        value: space_id_2
        action: upsert
      - key: api_key
        value: api_key_2
        action: upsert

processors:
  transform:
    error_mode: ignore
    trace_statements:
      - set(resource.attributes["openinference.project.name"], span.attributes["metadata.project_name"])
      - set(resource.attributes["space_id"], span.attributes["metadata.space_id"])

connectors:
  routing:
    default_pipelines: [traces/space1]
    table:
      - context: resource
        condition: resource.attributes["space_id"] == "space_id_1"
        pipelines: [traces/space1]
      - context: resource
        condition: resource.attributes["space_id"] == "space_id_2"
        pipelines: [traces/space2]

exporters:
  otlp/space1:
    endpoint: "otlp.arize.com:443"
    auth:
      authenticator: headers_setter/space1
  otlp/space2:
    endpoint: "otlp.arize.com:443"
    auth:
      authenticator: headers_setter/space2

service:
  extensions: [headers_setter/space1, headers_setter/space2]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform]
      exporters: [routing]
    traces/space1:
      receivers: [routing]
      exporters: [otlp/space1]
    traces/space2:
      receivers: [routing]
      exporters: [otlp/space2]
Set routing attributes in your app:
span.set_attribute("metadata.project_name", "your-project-name")
span.set_attribute("metadata.space_id", "space_id_1")
[Figure: OTEL Collector routing traces from one application to multiple Arize spaces based on span attributes]

Alternative for Centralized Gateway Collectors

If you operate a centralized collector that serves many teams, you may not want to redeploy the collector every time a new Arize space is added. In that setup, have each application send the target arize-space-id as request metadata to the collector, then configure the collector to forward that metadata as an outbound header to Arize. This pattern works well when:
  • The collector is shared across many teams
  • Teams are responsible for selecting their own target Arize space
  • You want to avoid maintaining a separate headers_setter instance for every space
When using this approach, make sure the OTLP receiver is configured with include_metadata: true; otherwise, the inbound request headers will not be available to the headers_setter extension.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        include_metadata: true
      http:
        endpoint: 0.0.0.0:4318
        include_metadata: true

extensions:
  headers_setter:
    headers:
      - action: upsert
        key: arize-space-id
        from_context: arize-space-id

exporters:
  otlp/arize:
    endpoint: "otlp.arize.com:443"
    auth:
      authenticator: headers_setter
Applications should still set the project name as a resource attribute, for example openinference.project.name, so traces continue to land in the expected Arize project. This approach is best suited for trusted internal environments where the collector is allowed to honor caller-provided routing metadata. Beyond centralized routing, you may also need to handle tracing context across async boundaries and services:

Manual Context Propagation

OpenTelemetry handles context propagation automatically in most cases. For async workflows or custom concurrency, you may need to do it manually.

Async Functions

import asyncio
from opentelemetry import trace
from opentelemetry.context import attach, detach, get_current

tracer = trace.get_tracer(__name__)

async def async_func(ctx):
    token = attach(ctx)
    try:
        current_span = trace.get_current_span()
        current_span.set_attribute("input.value", "User Input")
        await asyncio.sleep(1)
    finally:
        detach(token)

def sync_func():
    with tracer.start_as_current_span("sync_span") as span:
        context = get_current()
        asyncio.run(async_func(context))

Multi-Service Propagation

Propagate tracing context across HTTP calls between microservices.
Service A (sends request):
import requests
from opentelemetry import trace, propagate

tracer = trace.get_tracer(__name__)

def make_request_to_service_b():
    with tracer.start_as_current_span("llm_service_a") as span:
        headers = {}
        propagate.inject(headers)
        response = requests.get("http://service-b:5000/endpoint", headers=headers)
        return response.text
Service B (receives request):
from flask import Flask, request
from opentelemetry import trace, propagate

app = Flask(__name__)
tracer = trace.get_tracer(__name__)

@app.route("/endpoint")
def endpoint():
    context = propagate.extract(dict(request.headers))
    with tracer.start_as_current_span("service_b_processing", context=context) as span:
        span.add_event("Received request in service B")
        return "Hello from Service B"

ThreadPoolExecutor

Preserve tracing context when submitting tasks to a thread pool:
import concurrent.futures
from opentelemetry import trace
from opentelemetry.context import attach, detach, get_current

tracer = trace.get_tracer(__name__)

def wrapped_func(func):
    """Captures context from main thread, attaches in worker thread."""
    main_context = get_current()
    def wrapper():
        token = attach(main_context)
        try:
            return func()
        finally:
            detach(token)
    return wrapper

# Usage — wrap in the MAIN thread so main_context is captured before submission
with concurrent.futures.ThreadPoolExecutor() as executor:
    funcs = [func1, func2, func3]
    wrapped = [wrapped_func(f) for f in funcs]
    results = list(executor.map(lambda w: w(), wrapped))
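Because OpenTelemetry stores its context in Python's contextvars, the same effect can be sketched with the standard library alone: copy the context in the submitting thread and run the task inside the copy. The ContextVar below is a stand-in for real instrumentation:

```python
import concurrent.futures
from contextvars import ContextVar, copy_context

# Stand-in for OpenTelemetry's context slot, which is also a ContextVar.
current = ContextVar("demo", default="no-context")

def run_in_context(func):
    ctx = copy_context()  # snapshot captured in the submitting thread
    return lambda: ctx.run(func)

current.set("main-context")
with concurrent.futures.ThreadPoolExecutor() as executor:
    future = executor.submit(run_in_context(lambda: current.get()))
    print(future.result())  # → main-context
```

Without the copy, the worker thread starts with an empty context and would see the default value instead.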
If you’re generating too many spans, you can selectively control which ones get recorded:

Custom Sampling

Control which spans get recorded to manage telemetry volume and cost:
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import Sampler, SamplingResult, Decision

class UserBasedSampler(Sampler):
    """Drop spans for a specific user ID."""
    def should_sample(
        self, parent_context, trace_id, name,
        kind=None, attributes=None, links=None, trace_state=None,
    ):
        user_id = (attributes or {}).get("user.id")
        if user_id == "user-to-drop":
            return SamplingResult(
                decision=Decision.DROP,
                attributes={"sampler.reason": f"Dropping user.id={user_id}"},
            )
        return SamplingResult(decision=Decision.RECORD_AND_SAMPLE, attributes={})

    def get_description(self) -> str:
        return "UserBasedSampler"

tracer_provider = TracerProvider(sampler=UserBasedSampler())
One gotcha when mixing manual spans with context attributes:

Inheriting Context Attributes in Manual Spans

Context attributes from using_session, using_metadata, etc. are NOT automatically attached to manually created spans. Use this helper to pull them in:
from contextlib import contextmanager
from openinference.instrumentation import get_attributes_from_context

@contextmanager
def create_span_with_context(tracer, name, **kwargs):
    """Start a span and copy context attributes (session, metadata, ...) onto it."""
    with tracer.start_as_current_span(name, **kwargs) as span:
        span.set_attributes(dict(get_attributes_from_context()))
        yield span

# Usage
with using_session("my-session-id"):
    with create_span_with_context(tracer, "my-manual-span") as span:
        # span now has session.id attached
        ...
Finally, for production resilience:

Health Check Pattern

Validate endpoint connectivity before initializing the tracer, falling back gracefully when the service is unavailable:
import httpx
import logging
from opentelemetry.trace import NoOpTracerProvider
from arize.otel import register

def create_tracer_provider():
    try:
        with httpx.Client(timeout=3.0) as client:
            response = client.get("https://otlp.arize.com")
            response.raise_for_status()

        logging.info("Tracing endpoint healthy — initializing Arize tracer")
        return register(
            space_id="your-space-id",
            api_key="your-api-key",
            project_name="your-project",
        )
    except Exception as e:
        logging.warning(f"Tracing unavailable: {e}. Using NoOp tracer.")
        return NoOpTracerProvider()

You’ve completed the Instrument workflow

Your application is fully instrumented — traces flow to Arize AX with the data, context, and configuration you need. Now start observing:

Next: View Your Traces