The OpenTelemetry Collector is a service that receives telemetry, processes it, and exports it to one or more backends. It’s optional — most applications send spans directly from the SDK to Arize AX — but it unlocks centralized policy, multi-backend fan-out, dynamic routing, and tail sampling that the SDK can’t express. For practical Collector configuration including authentication and dynamic project routing, see Advanced Patterns: OTEL Collector. This page covers the concept.

Anatomy of a Collector

Collectors are composed of pipelines. Each pipeline chains three component types:

| Component | Role |
| --- | --- |
| Receiver | Listens for incoming telemetry (OTLP over gRPC, OTLP over HTTP, Jaeger, Zipkin, …). |
| Processor | Transforms, filters, enriches, batches, or samples spans as they flow through. |
| Exporter | Sends the processed telemetry out: to Arize AX, to another backend, to multiple backends, or to disk. |

All components and pipelines are defined in a single config.yaml. The Collector’s flexibility comes from how those components are composed.

[Figure: OpenTelemetry Collector architecture, showing receivers → processors → exporters in a pipeline]

For the canonical specification, see OpenTelemetry Collector documentation.
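
As a rough illustration of how the three component types compose, a minimal config.yaml might look like the sketch below. The port on the Arize AX endpoint, the environment variable names, and the choice of the arize-space-id/arize-api-key header form are assumptions rather than values confirmed on this page; the Advanced Patterns page is the place to check the exact settings.

```yaml
# Minimal sketch: one traces pipeline, OTLP in, OTLP out to Arize AX.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # applications send OTLP/gRPC here
      http:
        endpoint: 0.0.0.0:4318   # applications send OTLP/HTTP here

processors:
  batch:                         # default batching settings

exporters:
  otlp:
    endpoint: otlp.arize.com:443 # port is an assumption
    headers:
      arize-space-id: ${env:ARIZE_SPACE_ID}  # env var names are hypothetical
      arize-api-key: ${env:ARIZE_API_KEY}

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```

A component only takes effect once it is listed under service.pipelines; defining it in the top-level sections alone does nothing.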

Deployment Models

Three common ways to deploy a Collector:

| Model | How it works | Best for |
| --- | --- | --- |
| Agent | Collector runs alongside the application as a sidecar container, a daemonset, or a local process. | Simple setups, clear 1:1 mapping between app and collector. |
| Gateway | A central collector receives from many applications. | Centralized policy, credential management, single point of egress. |
| Hybrid | Agent collectors forward to a centralized gateway. | Large environments that need distributed collection plus centralized processing. |
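
To make the hybrid model concrete, here is a sketch of the agent side: a Collector that receives spans locally and forwards them to a gateway Collector instead of to Arize AX. The gateway address is hypothetical, and the plaintext connection assumes a trusted internal network; the gateway would carry the Arize AX exporter and credentials.

```yaml
# agent-config.yaml: runs next to the application as a sidecar.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: localhost:4317   # sidecar shares the app's network namespace
      http:
        endpoint: localhost:4318

processors:
  batch:

exporters:
  otlp:
    # Hypothetical in-cluster address of the gateway Collector.
    endpoint: gateway-collector.observability.svc.cluster.local:4317
    tls:
      insecure: true               # assumes plaintext inside a trusted network

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```

For a daemonset rather than a sidecar, the receivers would likely need to listen on 0.0.0.0 so that other pods on the node can reach them.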

Common Use Cases

Reasons to put a Collector between your applications and Arize AX:

| Use case | What the Collector does |
| --- | --- |
| Filtering | Drop spans by name, attribute, or pattern using a filter processor. Useful for cutting noise from health checks or known-uninteresting paths. |
| PII redaction or attribute modification | Use a transform processor to scrub sensitive fields, hash user IDs, or replace values before export. |
| Fan-out to multiple backends | Send the same spans to Arize AX and a long-term storage system, or to Arize AX and a metrics backend, without duplicating instrumentation in the app. |
| Dynamic project routing | Route traces to different Arize AX projects or spaces based on span attributes (see the dynamic routing example). |
| Tail sampling | Buffer complete traces and decide which to keep based on outcome (error spans, slow requests, specific attributes). |
| Credential centralization | Application code stays free of Arize AX API keys; the Collector handles authentication at a single chokepoint. |
| Tail-end batching | Reduce the number of network round-trips from your fleet to Arize AX by batching across applications at the Collector. |
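
Several of the use cases above can live in one pipeline. The sketch below combines filtering, redaction, tail sampling, and fan-out. It assumes the contrib distribution of the Collector, which ships the filter, transform, and tail_sampling processors. The span name, attribute keys, thresholds, archive endpoint, and environment variable names are all hypothetical placeholders.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  # Filtering: a span is dropped if any of these OTTL conditions is true.
  filter/health:
    error_mode: ignore
    traces:
      span:
        - 'name == "GET /healthz"'
        - 'attributes["http.route"] == "/healthz"'

  # Redaction: hash one attribute and delete another before export.
  transform/redact:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          - set(attributes["user.id"], SHA256(attributes["user.id"])) where attributes["user.id"] != nil
          - delete_key(attributes, "user.email")

  # Tail sampling: buffer each trace, then keep it if any policy matches.
  tail_sampling:
    decision_wait: 10s
    policies:
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: keep-slow
        type: latency
        latency:
          threshold_ms: 5000

  batch:

exporters:
  otlp/arize:
    endpoint: otlp.arize.com:443           # port is an assumption
    headers:
      arize-space-id: ${env:ARIZE_SPACE_ID}
      arize-api-key: ${env:ARIZE_API_KEY}
  otlp/archive:
    endpoint: archive.example.internal:4317 # hypothetical second backend
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [filter/health, transform/redact, tail_sampling, batch]
      exporters: [otlp/arize, otlp/archive]   # fan-out: both receive every kept span
```

Filtering runs first so that dropped spans never occupy the tail-sampling buffer. Tail sampling only works if all spans of a trace reach the same Collector instance, which is one reason it is usually placed on a gateway.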

Common Pitfalls

A few Collector failure modes to know about:
  • Forgetting Arize AX authentication: the Collector needs to add arize-space-id and arize-api-key (or space_id/api_key) headers to outbound requests. Use the headers_setter extension or set them in the exporter configuration. Without them, the Arize AX ingestion endpoint rejects the spans.
  • Modifying shared span objects across pipelines: when one pipeline mutates a span, every other pipeline that processes the same span sees the modification. Use a routing connector to duplicate spans cleanly before fan-out.
  • No batch processor at the end of the pipeline: for production volumes, the last processor in a pipeline should be a batch processor. Without it, the Collector exports each incoming payload as it arrives instead of combining payloads into larger requests, which is inefficient at volume.
  • Wrong Collector endpoint: applications need to point at the Collector’s OTLP endpoint (gRPC: :4317, HTTP: :4318), not at Arize AX directly. Mixing the two is a common source of “why are some spans missing?” debugging.
  • Receiver missing include_metadata: true: when a centralized gateway uses inbound request metadata for routing (e.g., reading the target Arize AX space from a header), the receiver has to be told to make that metadata available. Without it, the component that does the routing has nothing to read.
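
The authentication, batching, and include_metadata pitfalls above can be addressed together in a gateway configuration along these lines. This is a sketch under assumptions: clients are assumed to send the target space as an arize-space-id request header, the API key is assumed to live in an ARIZE_API_KEY environment variable on the gateway, and the endpoint port is a guess. The headers_setter extension ships in the contrib distribution.

```yaml
extensions:
  headers_setter:
    headers:
      # Copy the space ID from the inbound request metadata.
      - action: upsert
        key: arize-space-id
        from_context: arize-space-id
      # Attach the API key centrally so applications never hold it.
      - action: upsert
        key: arize-api-key
        value: ${env:ARIZE_API_KEY}

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        include_metadata: true   # without this, from_context has nothing to read
      http:
        endpoint: 0.0.0.0:4318
        include_metadata: true

processors:
  batch:
    # Batch per distinct space ID so metadata survives batching.
    metadata_keys: [arize-space-id]

exporters:
  otlp:
    endpoint: otlp.arize.com:443 # port is an assumption
    auth:
      authenticator: headers_setter

service:
  extensions: [headers_setter]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
```

The metadata_keys setting matters here: a batch processor without it merges requests from different clients, and the inbound metadata is lost before the exporter can use it.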

Where the Collector Fits

Both with and without a Collector, your application code looks the same — the Exporter points at an OTLP endpoint. The difference is just which OTLP endpoint:
Without Collector:
  App → Exporter → otlp.arize.com

With Collector:
  App → Exporter → Collector → (processors) → Arize AX (and/or other backends)
Start without a Collector. Add one when you need centralized policy, multi-backend fan-out, tail sampling, or routing logic the SDK can’t express.
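
One way to picture the endpoint switch is a Docker Compose sketch like the one below. The application image name is hypothetical, and the sketch assumes the application's exporter honors the standard OTEL_EXPORTER_OTLP_ENDPOINT environment variable; if the endpoint is set in code instead, the same change applies there.

```yaml
services:
  app:
    image: example/my-llm-app:latest   # hypothetical application image
    environment:
      # With a Collector: point at the Collector's OTLP/HTTP port.
      # Without one, this would be the Arize AX OTLP endpoint instead.
      OTEL_EXPORTER_OTLP_ENDPOINT: http://otel-collector:4318
    depends_on:
      - otel-collector

  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    volumes:
      # Default config path for the contrib image.
      - ./config.yaml:/etc/otelcol-contrib/config.yaml:ro
    environment:
      ARIZE_SPACE_ID: ${ARIZE_SPACE_ID}
      ARIZE_API_KEY: ${ARIZE_API_KEY}
```

Only the endpoint value changes between the two setups; the instrumentation in the application stays as it is.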

You’ve completed the OpenTelemetry and OpenInference reference

Every concept in this section — signals, the four OTel components, the Arize AX helpers, OpenInference conventions, span kinds, instrumentation approaches, context, sampling, and the Collector — is now in your toolbox. Time to instrument:

Set up tracing