Trace-Level LLM Evaluations with Arize AX
Most often, we hear about evaluating LLM applications at the span level: checking whether a tool call succeeded, whether an LLM hallucinated, or whether a response matched expectations…
2 minutes read
By Sanjana Yeddula