# Track Costs

> Configure cost tracking to monitor LLM spend by model, provider, and token type

Know what every request costs. A single agent might chain five LLM calls — without cost tracking, you're guessing what you're spending. Arize AX calculates cost for every span, aggregates it at the trace level, and lets you filter, monitor, and optimize from there.

# What is Cost Tracking?

Arize AX calculates the cost of every LLM call in your traces — at the trace level (total cost of a request) and at the span level (cost of each individual LLM call). Use it to:

* Spot which requests or agents are expensive and why
* Track spend across models and providers over time
* Catch cost spikes before they become budget problems
* Compare cost/quality tradeoffs between different models

Arize AX includes **default cost configurations for common models** (GPT-4o, Claude, Gemini, Mistral, and more), so in many cases you can get started with no setup.

<Info>
  If there is a default you'd like us to add, reach out to [support@arize.com](mailto:support@arize.com).
</Info>

# Token Tracking

Arize AX tracks token usage via standard [OpenInference](https://github.com/Arize-ai/openinference) attributes on your LLM spans:

| Attribute                    | Description                                  |
| :--------------------------- | :------------------------------------------- |
| `llm.token_count.prompt`     | Number of tokens in the prompt               |
| `llm.token_count.completion` | Number of tokens in the completion           |
| `llm.token_count.total`      | Total number of tokens (prompt + completion) |
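As a minimal sketch, the attributes in the table above can be assembled as a plain dictionary before they are attached to an LLM span. The helper name `token_count_attributes` is hypothetical and not part of any Arize or OpenInference API; only the attribute keys come from the table. With the OpenTelemetry Python API, a dictionary like this would typically be passed to `span.set_attributes(...)`.

```python
def token_count_attributes(prompt_tokens: int, completion_tokens: int) -> dict:
    """Build OpenInference token-count attributes for one LLM span.

    Hypothetical helper for illustration; only the keys come from the docs.
    """
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return {
        "llm.token_count.prompt": prompt_tokens,
        "llm.token_count.completion": completion_tokens,
        # The total is defined as prompt + completion.
        "llm.token_count.total": prompt_tokens + completion_tokens,
    }


attrs = token_count_attributes(prompt_tokens=1200, completion_tokens=300)
print(attrs)
```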

Cost is calculated based on these token counts and the cost configuration for the model. The system supports multiple token types for detailed cost breakdowns:

| Token Type    | Category   | Description                           |
| :------------ | :--------- | :------------------------------------ |
| `input`       | Prompt     | Regular input tokens                  |
| `cache`       | Prompt     | Cached prompt tokens                  |
| `cache_read`  | Prompt     | Cache read tokens                     |
| `cache_write` | Prompt     | Cache write tokens                    |
| `cache_input` | Prompt     | Cached input tokens                   |
| `output`      | Completion | Regular output tokens                 |
| `reasoning`   | Completion | Reasoning tokens (e.g., o1/o3 models) |
| `audio`       | Both       | Audio tokens                          |

Cost configs also support **tiered pricing** — volume-based pricing where cost per token changes based on total token count thresholds.
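To illustrate the idea of tiered pricing, here is a rough sketch under stated assumptions: each tier is a `(minimum total tokens, rate per 1M tokens)` pair, and the rate of the highest tier reached applies to all tokens of that type. The exact tier semantics in Arize AX may differ, and the thresholds, rates, and function names below are invented for illustration.

```python
from bisect import bisect_right

# Hypothetical tiers, sorted by threshold: (minimum total tokens, USD per 1M tokens).
INPUT_TIERS = [(0, 3.00), (200_000, 6.00)]


def tiered_rate(total_tokens: int, tiers: list) -> float:
    """Return the rate of the highest tier whose threshold is <= total_tokens."""
    thresholds = [threshold for threshold, _ in tiers]
    index = bisect_right(thresholds, total_tokens) - 1
    if index < 0:
        raise ValueError("total_tokens is below the lowest tier threshold")
    return tiers[index][1]


def tiered_cost(token_count: int, total_tokens: int, tiers: list) -> float:
    """Cost of token_count tokens at the rate selected by total_tokens (assumed semantics)."""
    return token_count / 1_000_000 * tiered_rate(total_tokens, tiers)


print(tiered_rate(150_000, INPUT_TIERS))  # falls in the first tier
print(tiered_rate(250_000, INPUT_TIERS))  # crosses the 200,000-token threshold
```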

The next section describes how Arize AX turns these token counts into cost.

# How Cost Tracking Works

When a span is received, Arize AX determines cost as follows:

1. If the span already includes cost attributes (set by the client), those values are used as-is.
2. Otherwise, the system looks up a cost configuration by matching the span's model name (`llm.model_name`, with the fallbacks listed under Lookup Logic) and `llm.provider`.
3. The matching config's per-token rates are applied to the span's token counts.
4. Cost configs are cached with a 10-minute TTL for performance.
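The first three steps above can be sketched in a few lines of Python. This is an illustration of the described precedence, not Arize AX's actual implementation: the `COST_CONFIGS` table, its rates, the model and provider names, and the `resolve_cost` function are all hypothetical, and the config cache from step 4 is omitted.

```python
from typing import Optional

TOKENS_PER_UNIT = 1_000_000  # rates are expressed per 1 million tokens

# Hypothetical configs keyed by (model name, provider); rates are invented.
COST_CONFIGS = {
    ("example-model", "example-provider"): {"prompt": 2.50, "completion": 10.00},
}


def resolve_cost(attrs: dict) -> Optional[dict]:
    """Return cost attributes for one span, or None if no config matches."""
    # Step 1: cost attributes set by the client are used as-is.
    client_cost = {k: v for k, v in attrs.items() if k.startswith("llm.cost.")}
    if client_cost:
        return client_cost

    # Step 2: look up a config by model name and provider.
    key = (attrs.get("llm.model_name"), attrs.get("llm.provider"))
    config = COST_CONFIGS.get(key)
    if config is None:
        return None

    # Step 3: apply the config's rates to the span's token counts.
    prompt = attrs.get("llm.token_count.prompt", 0) / TOKENS_PER_UNIT * config["prompt"]
    completion = (
        attrs.get("llm.token_count.completion", 0) / TOKENS_PER_UNIT * config["completion"]
    )
    return {
        "llm.cost.prompt": prompt,
        "llm.cost.completion": completion,
        "llm.cost.total": prompt + completion,
    }


span_attrs = {
    "llm.model_name": "example-model",
    "llm.provider": "example-provider",
    "llm.token_count.prompt": 1_000_000,
    "llm.token_count.completion": 500_000,
}
print(resolve_cost(span_attrs))
```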

Cost attributes on spans:

| Attribute                       | Description                             |
| :------------------------------ | :-------------------------------------- |
| `llm.cost.prompt`               | Total prompt cost                       |
| `llm.cost.completion`           | Total completion cost                   |
| `llm.cost.total`                | Total cost                              |
| `llm.cost.prompt_details.*`     | Cost breakdown by prompt token type     |
| `llm.cost.completion_details.*` | Cost breakdown by completion token type |

# Set Up Cost Tracking

### 1. Use a Default (Zero Setup)

If your model and provider match a default, Arize AX automatically applies the correct pricing — no action needed.

### 2. Customize a Default

To tweak an existing config (e.g., apply discounts):

* Go to **Settings > Cost Tracking > Configuration**
* Click **Options > Clone** on a default config
* Edit fields like token type cost or provider name

<Frame>
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/gifs/cost_tracking_arize.gif" alt="Customizing cost tracking config in Arize AX" />
</Frame>

### 3. Create from Scratch

To define your own model config:

* Click **Add New**
* Enter the **model name** (required)
* Optionally enter the **provider**
* Specify cost per 1 million tokens for each token type
* Assign each token type to Prompt or Completion

<Info>
  Cost configs are saved at the **organization level**.
</Info>

<Frame>
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/gifs/cost_tracking_custom.gif" alt="Creating custom cost tracking config" />
</Frame>

# Using Cost Data

Once configured, cost data is available across the platform.

### Filtering and Monitoring

Use cost attributes to:

* Filter traces or spans where cost exceeds a defined threshold
* Create monitors for high-cost traces or model behavior anomalies
* Build dashboards based on specific token types or cost groupings

### Trace-Level Visualization

At the trace level, Arize AX aggregates cost across all LLM spans in the trace. This provides a complete view of how much it cost to serve a given request end-to-end.

<Frame>
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/instrument/trace_cost.png" alt="Trace-level cost aggregated across all LLM spans in a request in Arize AX" />
</Frame>
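Conceptually, the trace-level figure is a sum of `llm.cost.total` over the LLM spans that share a trace. A minimal sketch, assuming spans have been exported as dictionaries with hypothetical `trace_id` and `attributes` keys (this shape and the `trace_costs` function are illustrative, not an Arize export format):

```python
from collections import defaultdict


def trace_costs(spans: list) -> dict:
    """Sum llm.cost.total over the spans of each trace."""
    totals = defaultdict(float)
    for span in spans:
        cost = span.get("attributes", {}).get("llm.cost.total")
        if cost is not None:  # spans without a cost attribute are skipped
            totals[span["trace_id"]] += cost
    return dict(totals)


spans = [
    {"trace_id": "t1", "attributes": {"llm.cost.total": 0.25}},
    {"trace_id": "t1", "attributes": {"llm.cost.total": 0.5}},
    {"trace_id": "t1", "attributes": {}},  # e.g. a retriever span with no cost
    {"trace_id": "t2", "attributes": {"llm.cost.total": 0.125}},
]
print(trace_costs(spans))
```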

### Span-Level Visualization

You can also inspect cost at the individual span level, including a breakdown by token type. This allows you to:

* Pinpoint expensive steps in the LLM pipeline
* Analyze the relative contribution of different token categories (e.g., reasoning, cache, image)

<Frame>
  <img src="https://storage.googleapis.com/arize-phoenix-assets/assets/images/arize-docs-images/instrument/session_cost.png" alt="LLM span Attributes tab in Arize AX showing llm.cost breakdown with prompt, completion, prompt_details, completion_details, and total" />
</Frame>

## Lookup Logic

To determine cost:

1. We extract the model name from your trace using the following fallback order:
   * `llm.model_name` (Primary)
   * `llm.invocation_parameters.model` (Fallback 1)
   * `metadata.model` (Fallback 2)
2. If the span also carries a `provider`, we match on it as well (for example, to differentiate OpenAI from Azure OpenAI for `gpt-4`).
3. Each token type (e.g., prompt, completion, audio) is matched against the configuration, and its cost is calculated at the configured rate per 1 million tokens.
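The model-name fallback order in step 1 can be sketched as a simple ordered lookup. This assumes span attributes have already been flattened into a dictionary with dotted keys, which may not match how the values are stored on a real span; `extract_model_name` is a hypothetical helper.

```python
from typing import Optional

# Keys are checked in this order; the first non-empty value wins.
MODEL_NAME_KEYS = (
    "llm.model_name",                   # primary
    "llm.invocation_parameters.model",  # fallback 1
    "metadata.model",                   # fallback 2
)


def extract_model_name(attrs: dict) -> Optional[str]:
    """Return the model name from flattened span attributes, or None if absent."""
    for key in MODEL_NAME_KEYS:
        value = attrs.get(key)
        if value:
            return value
    return None


print(extract_model_name({"metadata.model": "gpt-4"}))
```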

<Warning>
  **Important:** Cost is not calculated retroactively. Configure pricing before you ingest traces; traces ingested earlier will not receive cost data.
</Warning>

## Supported Token Types and Semantic Conventions

You can send token counts for any of these types using [OpenInference semantic conventions](https://github.com/Arize-ai/openinference/blob/main/spec/semantic_conventions.md). The supported fields are listed below:

### Prompt Tokens

| Token Type                                  | Field Name                                   |
| :------------------------------------------ | :------------------------------------------- |
| Prompt (Includes all input subtypes to LLM) | `llm.token_count.prompt`                     |
| Prompt Details                              | `llm.token_count.prompt_details`             |
| Audio                                       | `llm.token_count.prompt_details.audio`       |
| Image                                       | `llm.token_count.prompt_details.image`       |
| Cache Input                                 | `llm.token_count.prompt_details.cache_input` |
| Cache Read                                  | `llm.token_count.prompt_details.cache_read`  |
| Cache Write                                 | `llm.token_count.prompt_details.cache_write` |

### Completion Tokens

| Token Type                                         | Field Name                                     |
| :------------------------------------------------- | :--------------------------------------------- |
| Completion (Includes all output subtypes from LLM) | `llm.token_count.completion`                   |
| Audio                                              | `llm.token_count.completion_details.audio`     |
| Reasoning                                          | `llm.token_count.completion_details.reasoning` |
| Image                                              | `llm.token_count.completion_details.image`     |

### Total Tokens (Optional)

`llm.token_count.total`

### Custom Token Types

You can also define custom token types under either `prompt_details` or `completion_details`. Just make sure to:

* Use semantic naming
* Include a matching token type and cost in your configuration

A cost is calculated for each token type you send, provided a matching token type is defined in your configuration.
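A rough sketch of how this matching plays out, assuming flattened span attributes and a hypothetical rate table: the custom token type `document`, the rates, and the `prompt_detail_costs` function are invented for illustration. Token types with no matching entry in the configuration simply receive no cost.

```python
PREFIX = "llm.token_count.prompt_details."

# Hypothetical per-1M-token rates in USD; "document" is a made-up custom type.
PROMPT_RATES = {"cache_read": 0.30, "document": 1.50}


def prompt_detail_costs(attrs: dict, rates: dict) -> dict:
    """Compute a cost for every prompt_details token type that has a configured rate."""
    costs = {}
    for key, count in attrs.items():
        if not key.startswith(PREFIX):
            continue
        token_type = key[len(PREFIX):]
        if token_type in rates:  # unmatched token types get no cost
            costs[f"llm.cost.prompt_details.{token_type}"] = (
                count / 1_000_000 * rates[token_type]
            )
    return costs


span_attrs = {
    "llm.token_count.prompt_details.document": 2_000_000,
    "llm.token_count.prompt_details.cache_read": 1_000_000,
    "llm.token_count.prompt_details.image": 500,  # no rate configured
}
print(prompt_detail_costs(span_attrs, PROMPT_RATES))
```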

***

## Next step

Configure your OpenTelemetry tracer for production — batch processing, routing, and resource attributes:

<Card title="Next: Configure Your Tracer" icon="arrow-right" href="/ax/instrument/configure-your-tracer" />
