Cost Tracking

Cost tracking is essential for understanding and managing the spend of your LLM-powered applications. Arize AX provides a flexible, powerful, and easy-to-configure system to track model usage costs across providers and model variants — whether you're using default pricing or defining custom rates.

How Cost Tracking Works

When spans are ingested, we look up the matching cost configuration using the model name and provider, then calculate costs based on the token usage in your spans.

Lookup Logic

To determine cost:

We extract the model name from your span using the following fallback order:

  • llm.model_name (Primary)

  • llm.invocation_parameters.model (Fallback 1)

  • metadata.model (Fallback 2)

We extract the provider from your span using llm.provider (if present).

Each token type (e.g., prompt, completion, audio) is matched against the configuration, and its cost is calculated from the configured rate, which is defined per one million tokens.
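For illustration, here is a minimal Python sketch of this lookup and calculation. The `CostConfig` shape and the rates are hypothetical and not Arize AX's internal implementation; the attribute names are the OpenInference fields documented on this page.

```python
# Sketch of the lookup and per-million-token cost math described above.
# CostConfig and the example rates are hypothetical.
from dataclasses import dataclass, field


@dataclass
class CostConfig:
    model: str
    provider: str | None = None  # optional; see matching rules below
    rates_per_million: dict[str, float] = field(default_factory=dict)


def extract_model_name(attrs: dict) -> str | None:
    # Fallback order: llm.model_name, then invocation parameters, then metadata.
    for key in ("llm.model_name",
                "llm.invocation_parameters.model",
                "metadata.model"):
        if attrs.get(key):
            return attrs[key]
    return None


def span_cost(attrs: dict, config: CostConfig) -> float:
    total = 0.0
    for token_type, rate in config.rates_per_million.items():
        count = attrs.get(f"llm.token_count.{token_type}", 0)
        total += count / 1_000_000 * rate  # rates are per 1M tokens
    return total


config = CostConfig(
    model="gpt-4o",
    provider="openai",
    rates_per_million={"prompt": 2.50, "completion": 10.00},  # example rates
)
attrs = {
    "llm.model_name": "gpt-4o",
    "llm.provider": "openai",
    "llm.token_count.prompt": 1_200,
    "llm.token_count.completion": 300,
}
print(span_cost(attrs, config))  # 1200/1M * 2.50 + 300/1M * 10.00 = 0.006
```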

Important: Cost calculation is not retroactive. To track costs, you must configure pricing before ingesting traces.

Provider Matching Behavior

Provider is optional when configuring a cost config, but it's important to understand how matching works:

| Your Cost Config | Incoming Span | Result |
| --- | --- | --- |
| Model + Provider | Model + same provider | ✅ Match |
| Model + Provider | Model + different provider | ❌ No match |
| Model + Provider | Model + no provider | ❌ No match |
| Model only (no provider) | Model + any provider | ✅ Match |
| Model only (no provider) | Model + no provider | ✅ Match |

⚠️ Important: If you configure a provider on your cost config, it will only match spans that have that exact provider value. Spans without provider data will not match provider-specific configs.
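The rules in the table above reduce to a few lines of logic. The sketch below is illustrative only, not Arize AX's actual implementation:

```python
# Illustrative sketch of the matching rules in the table above.
def config_matches(config_model: str, config_provider: str | None,
                   span_model: str, span_provider: str | None) -> bool:
    if config_model != span_model:
        return False
    if config_provider is None:
        return True  # model-only configs match any provider (or none)
    return config_provider == span_provider  # provider-specific: exact match only


assert config_matches("gpt-4o", "openai", "gpt-4o", "openai")      # ✅ Match
assert not config_matches("gpt-4o", "openai", "gpt-4o", "azure")   # ❌ No match
assert not config_matches("gpt-4o", "openai", "gpt-4o", None)      # ❌ No match
assert config_matches("gpt-4o", None, "gpt-4o", "anthropic")       # ✅ Match
assert config_matches("gpt-4o", None, "gpt-4o", None)              # ✅ Match
```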

Supported Token Types and Semantic Conventions

You can send any token type using OpenInference semantic conventions. The supported fields are listed below:

Prompt Tokens

| Token Type | Field Name |
| --- | --- |
| Prompt (includes all input subtypes to the LLM) | llm.token_count.prompt |
| Prompt Details | llm.token_count.prompt_details |
| Audio | llm.token_count.prompt_details.audio |
| Image | llm.token_count.prompt_details.image |
| Cache Input | llm.token_count.prompt_details.cache_input |
| Cache Read | llm.token_count.prompt_details.cache_read |
| Cache Write | llm.token_count.prompt_details.cache_write |

Completion Tokens

| Token Type | Field Name |
| --- | --- |
| Completion (includes all output subtypes from the LLM) | llm.token_count.completion |
| Audio | llm.token_count.completion_details.audio |
| Reasoning | llm.token_count.completion_details.reasoning |
| Image | llm.token_count.completion_details.image |

Total Tokens (Optional)

| Token Type | Field Name |
| --- | --- |
| Total | llm.token_count.total |
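If you instrument manually with the OpenTelemetry SDK, these fields are ordinary span attributes. A minimal sketch, assuming a tracer provider is already configured; the span name and token counts are placeholders:

```python
# Sketch of setting OpenInference token-count attributes on a span with the
# OpenTelemetry SDK. Tracer setup, span name, and counts are placeholders.
from opentelemetry import trace

tracer = trace.get_tracer("my-app")  # assumes a configured TracerProvider

with tracer.start_as_current_span("llm-call") as span:
    span.set_attribute("llm.model_name", "gpt-4o")
    span.set_attribute("llm.provider", "openai")
    span.set_attribute("llm.token_count.prompt", 1200)
    span.set_attribute("llm.token_count.completion", 300)
    span.set_attribute("llm.token_count.prompt_details.cache_read", 256)
    span.set_attribute("llm.token_count.total", 1500)
```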

Custom Token Types

You can also define custom token types under either prompt_details or completion_details. Just make sure to:

  • Use semantic, descriptive naming

  • Include a matching token type and cost in your configuration

Each token count you send is priced, provided a matching token type is defined in your configuration.
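Continuing the instrumentation sketch above, a custom subtype might look like the following. Note that `transcription` is an invented name for illustration, and a matching `transcription` token type and rate must exist in your cost configuration for it to be priced:

```python
# Hypothetical custom token subtype; "transcription" is an invented name.
# A matching "transcription" token type and rate must be defined in your
# cost configuration for this count to be priced.
with tracer.start_as_current_span("llm-call-with-audio") as span:
    span.set_attribute("llm.token_count.prompt", 900)
    span.set_attribute("llm.token_count.prompt_details.transcription", 480)
```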
