Time Series Evals with OpenAI o1-preview
We benchmarked o1-preview on our hardest eval task: time series trend evaluations. This post compares its performance against GPT-4o-mini, Claude 3.5 Sonnet, and GPT-4o.
o1-preview Time Series Evaluations (Arize AI)
Prompt Caching Benchmarking
We compare the performance and cost savings of prompt caching on Anthropic vs. OpenAI.
How to Make Your AI App Feel Magical: Prompt Caching (Arize AI)
Multi-Agent Systems: Swarm
We compare OpenAI’s experimental Swarm repo with two other popular multi-agent frameworks: AutoGen and CrewAI.
Comparing OpenAI Swarm with other Multi Agent Frameworks (Arize AI)
Instrumenting LLMs with OTel
Lessons learned from our journey to one million downloads of our OpenTelemetry wrapper, OpenInference.
Zero to a Million: Instrumenting LLMs with OTEL (Arize AI)
Comparing Agent Frameworks
We built the same agent in LangGraph, LlamaIndex Workflows, CrewAI, AutoGen, and pure code. See how each framework compares.
Comparing Agent Frameworks (Arize AI)
Testing Generation in RAG
Testing the generation stage of RAG across GPT-4 and Claude 2.1.
Evaluating the Generation Stage in RAG (Arize AI)