# Track Costs

> Configure cost tracking to monitor LLM spend by model, provider, and token type

Know what every request costs. A single agent might chain five LLM calls, and without cost tracking you're guessing what you're spending. Arize AX calculates cost for every span, aggregates it at the trace level, and lets you filter, monitor, and optimize from there.

# What is Cost Tracking?

Arize AX calculates the cost of every LLM call in your traces at two levels: the trace level (total cost of a request) and the span level (cost of each individual LLM call). Use it to:

* Spot which requests or agents are expensive and why
* Track spend across models and providers over time
* Catch cost spikes before they become budget problems
* Compare cost/quality tradeoffs between different models

Arize AX includes **default cost configurations for common models** (GPT-4o, Claude, Gemini, Mistral, and more), so in many cases no setup is required. If there is a default you'd like us to add, reach out to [support@arize.com](mailto:support@arize.com).

# Token Tracking

Arize AX tracks token usage via standard [OpenInference](https://github.com/Arize-ai/openinference) attributes on your LLM spans:

| Attribute                    | Description                                  |
| :--------------------------- | :------------------------------------------- |
| `llm.token_count.prompt`     | Number of tokens in the prompt               |
| `llm.token_count.completion` | Number of tokens in the completion           |
| `llm.token_count.total`      | Total number of tokens (prompt + completion) |

Cost is calculated based on these token counts and the cost configuration for the model.

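
As a minimal sketch of the attributes in the table above, the three token counts for one LLM span can be assembled as a plain dictionary. The helper name `token_count_attributes` is hypothetical and only for illustration; the attribute keys are the ones listed in the table.

```python
def token_count_attributes(prompt_tokens: int, completion_tokens: int) -> dict[str, int]:
    """Build the token-count attributes for a single LLM span (illustrative helper)."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return {
        "llm.token_count.prompt": prompt_tokens,
        "llm.token_count.completion": completion_tokens,
        # Total is defined as prompt + completion.
        "llm.token_count.total": prompt_tokens + completion_tokens,
    }


attrs = token_count_attributes(prompt_tokens=1200, completion_tokens=300)
```

If you set attributes by hand with OpenTelemetry, each key/value pair could be passed to `span.set_attribute(key, value)`. OpenInference auto-instrumentation typically populates these for you, so manual code like this is usually only needed for custom LLM clients.
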
The system supports multiple token types for detailed cost breakdowns:

| Token Type    | Category   | Description                           |
| :------------ | :--------- | :------------------------------------ |
| `input`       | Prompt     | Regular input tokens                  |
| `cache`       | Prompt     | Cached prompt tokens                  |
| `cache_read`  | Prompt     | Cache read tokens                     |
| `cache_write` | Prompt     | Cache write tokens                    |
| `cache_input` | Prompt     | Cached input tokens                   |
| `output`      | Completion | Regular output tokens                 |
| `reasoning`   | Completion | Reasoning tokens (e.g., o1/o3 models) |
| `audio`       | Both       | Audio tokens                          |

Cost configs also support **tiered pricing**: volume-based pricing where the cost per token changes based on total token count thresholds.

These token counts are the basis for how Arize AX calculates cost.

# How Cost Tracking Works

When a span is received, Arize AX determines cost as follows:

1. If the span already includes cost attributes (set by the client), those values are used as-is.
2. Otherwise, the system looks up a cost configuration by matching `llm.model_name` and `llm.provider`.
3. The matching config's per-token rates are applied to the span's token counts.
4. Cost configs are cached with a 10-minute TTL for performance.

Cost attributes on spans:

| Attribute                       | Description                             |
| :------------------------------ | :-------------------------------------- |
| `llm.cost.prompt`               | Total prompt cost                       |
| `llm.cost.completion`           | Total completion cost                   |
| `llm.cost.total`                | Total cost                              |
| `llm.cost.prompt_details.*`     | Cost breakdown by prompt token type     |
| `llm.cost.completion_details.*` | Cost breakdown by completion token type |

# Set Up Cost Tracking

### 1. Use a Default (Zero Setup)

If your model and provider match a default, Arize AX automatically applies the correct pricing. No action is needed.

### 2. Customize a Default

To tweak an existing config (e.g., apply discounts):

* Go to **Settings > Cost Tracking > Configuration**
* Click **Options > Clone** on a default config
* Edit fields like token type cost or provider name

Customizing cost tracking config in Arize AX

### 3. Create from Scratch

To define your own model config:

* Click **Add New**
* Enter the **model name** (required)
* Optionally enter the **provider**
* Specify cost per 1 million tokens for each token type
* Assign each token type to Prompt or Completion

Cost configs are saved at the **organization level**.

Creating custom cost tracking config
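
To make the "cost per 1 million tokens" unit concrete, here is a rough sketch of how per-million rates and Prompt/Completion assignments could turn token counts into cost. It is not Arize AX's implementation: the rates are made-up placeholders rather than real prices, `span_cost` is a hypothetical helper, and tiered pricing is ignored.

```python
from decimal import Decimal

# Hypothetical rates in USD per 1 million tokens (placeholders, not real prices).
RATES_PER_MILLION = {
    "input": Decimal("2.50"),
    "output": Decimal("10.00"),
}

# Which bucket each token type is assigned to in the config.
CATEGORY = {"input": "prompt", "output": "completion"}

ONE_MILLION = Decimal(1_000_000)


def span_cost(token_counts: dict[str, int]) -> dict[str, Decimal]:
    """Apply per-million rates to token counts and sum by category."""
    cost = {"prompt": Decimal(0), "completion": Decimal(0)}
    for token_type, count in token_counts.items():
        rate = RATES_PER_MILLION.get(token_type)
        if rate is None:
            continue  # no matching token type in the config, so no cost is added
        cost[CATEGORY[token_type]] += rate * count / ONE_MILLION
    cost["total"] = cost["prompt"] + cost["completion"]
    return cost


cost = span_cost({"input": 1200, "output": 300})
# cost["prompt"] == Decimal("0.003"), cost["completion"] == Decimal("0.003")
```

`Decimal` is used here only to keep the arithmetic exact in the example; it says nothing about how costs are stored on the platform.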

# Using Cost Data

Once configured, cost data is available across the platform.

### Filtering and Monitoring

Cost attributes can be used to:

* Filter traces or spans where cost exceeds a defined threshold
* Create monitors for high-cost traces or model behavior anomalies
* Build dashboards based on specific token types or cost groupings

### Trace-Level Visualization

At the trace level, Arize AX aggregates cost across all LLM spans in the trace. This gives a complete view of how much it cost to serve a given request end-to-end.

Trace-level cost aggregated across all LLM spans in a request in Arize AX
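
The trace-level aggregation and threshold filtering described above can be sketched offline as follows. This assumes spans have been exported as flat dictionaries with a `trace_id` field; that shape, the sample values, and the helper names are hypothetical. Only `llm.cost.total` comes from this page.

```python
from collections import defaultdict

# Hypothetical exported spans; only the fields used here are shown.
spans = [
    {"trace_id": "t1", "llm.cost.total": 0.004},
    {"trace_id": "t1", "llm.cost.total": 0.010},
    {"trace_id": "t1"},  # non-LLM span with no cost attribute
    {"trace_id": "t2", "llm.cost.total": 0.002},
]


def cost_by_trace(spans):
    """Sum llm.cost.total over all spans that share a trace ID."""
    totals = defaultdict(float)
    for span in spans:
        totals[span["trace_id"]] += span.get("llm.cost.total", 0.0)
    return dict(totals)


def traces_over(totals, threshold):
    """Return the IDs of traces whose total cost exceeds the threshold."""
    return sorted(trace_id for trace_id, cost in totals.items() if cost > threshold)


totals = cost_by_trace(spans)
expensive = traces_over(totals, threshold=0.01)
```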

### Span-Level Visualization

You can also inspect cost at the individual span level, including a breakdown by token type. This allows you to:

* Pinpoint expensive steps in the LLM pipeline
* Analyze the relative contribution of different token categories (e.g., reasoning, cache, image)

LLM span Attributes tab in Arize AX showing llm.cost breakdown with prompt, completion, prompt_details, completion_details, and total
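
For a sense of what the span-level breakdown lets you do, the sketch below ranks the detail costs on a single span from most to least expensive. The specific sub-keys (such as `prompt_details.input`) and the values are assumptions for illustration; this page only specifies the `llm.cost.prompt_details.*` and `llm.cost.completion_details.*` prefixes.

```python
# Hypothetical attributes for one LLM span; sub-key names and values are illustrative.
span_attributes = {
    "llm.cost.prompt": 0.0035,
    "llm.cost.completion": 0.0085,
    "llm.cost.total": 0.0120,
    "llm.cost.prompt_details.input": 0.0030,
    "llm.cost.prompt_details.cache_read": 0.0005,
    "llm.cost.completion_details.output": 0.0025,
    "llm.cost.completion_details.reasoning": 0.0060,
}

DETAIL_PREFIXES = ("llm.cost.prompt_details.", "llm.cost.completion_details.")


def cost_breakdown(attributes):
    """Return (detail_key, cost) pairs sorted from most to least expensive."""
    details = {
        key.removeprefix("llm.cost."): value
        for key, value in attributes.items()
        if key.startswith(DETAIL_PREFIXES)
    }
    return sorted(details.items(), key=lambda item: item[1], reverse=True)


breakdown = cost_breakdown(span_attributes)
top_key, top_cost = breakdown[0]
```

In this made-up span, reasoning tokens are the largest contributor, which is the kind of signal the Attributes tab is meant to surface.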

## Lookup Logic

To determine cost:

1. We extract the model name from your trace using the following fallback order:
   * `llm.model_name` (Primary)
   * `llm.invocation_parameters.model` (Fallback 1)
   * `metadata.model` (Fallback 2)
2. Optionally, if you provide a `provider`, we'll match that as well (e.g., differentiating OpenAI vs Azure OpenAI for `gpt-4`).
3. Each token type (e.g., prompt, completion, audio) is matched against the configuration, and its cost is calculated from the configured rate per 1 million tokens.

**Important:** Cost is not retroactive. To track costs, you must configure pricing before ingesting traces.

## Supported Token Types and Semantic Conventions

You can send any token types using [OpenInference semantic conventions](https://github.com/Arize-ai/openinference/blob/main/spec/semantic_conventions.md). Below are the supported fields:

### Prompt Tokens

| Token Type                                  | Field Name                                   |
| :------------------------------------------ | :------------------------------------------- |
| Prompt (Includes all input subtypes to LLM) | `llm.token_count.prompt`                     |
| Prompt Details                              | `llm.token_count.prompt_details`             |
| Audio                                       | `llm.token_count.prompt_details.audio`       |
| Image                                       | `llm.token_count.prompt_details.image`       |
| Cache Input                                 | `llm.token_count.prompt_details.cache_input` |
| Cache Read                                  | `llm.token_count.prompt_details.cache_read`  |
| Cache Write                                 | `llm.token_count.prompt_details.cache_write` |

### Completion Tokens

| Token Type                                         | Field Name                                     |
| :------------------------------------------------- | :--------------------------------------------- |
| Completion (Includes all output subtypes from LLM) | `llm.token_count.completion`                   |
| Audio                                              | `llm.token_count.completion_details.audio`     |
| Reasoning                                          | `llm.token_count.completion_details.reasoning` |
| Image                                              | `llm.token_count.completion_details.image`     |

### Total Tokens (Optional)

`llm.token_count.total`

### Custom Token Types

You can also define custom token types under either `prompt_details` or `completion_details`. Just make sure to:

* Use semantic naming
* Include a matching token type and cost in your configuration

A cost is calculated for each token type you send, provided a matching token type is defined in your configuration.

***

## Next step

Configure your OpenTelemetry tracer for production (batch processing, routing, and resource attributes):