The Evaluator
Your go-to blog for insights on AI observability and evaluation.
Google Antigravity and Arize AX’s MCP Tracing Assistant: How to Trace Your Agent Without Writing Any Code
TL;DR: Add the Arize AX MCP server to Antigravity to instrument your AI applications without leaving your IDE. Instrumenting AI applications with tracing and observability is critical for debugging, monitoring,…
How Context Graphs Turn Agent Traces Into Durable Business Assets
In their recent essay making the rounds, Foundation Capital’s Jaya Gupta and Ashu Garg argue that the next enterprise data advantage will come from capturing decision traces and stitching them…
New In Arize AX: Multi-Span Filters and Improved Playground Views
Arize AX released a raft of new updates to close out December of 2025. From improved playground views to multi-span filters, here are some highlights. Multi-Span Filters Filter traces using…
EU AI Act Compliance: What AI Engineering Teams Should Monitor
The EU AI Act is no longer a distant regulatory concept; it is in force, and enterprises are road-testing their real-world implementations. The core law is Regulation (EU) 2024/1689,…
How TheFork Leverages Online Evals To Boost Conversions with Arize AX on AWS
TheFork is one of Europe’s leading restaurant discovery and booking platforms, connecting millions of diners with tens of thousands of restaurants across major cities. The company’s marketplace spans everything from…
New In Arize AX: OpenInference TypeScript 2.0, Session Annotations, Integrations Revamp
Arize AX released a flurry of updates in November of 2025. From OpenInference TypeScript 2.0 to a revamp of integrations, there is a lot to catch up on. OpenInference TypeScript…
AWS Bedrock AgentCore Observability with Arize AX: Operationalizing AI Agents At Scale
Building an AI agent in a notebook is straightforward. Getting that agent to run reliably at scale is a different challenge entirely. Most teams hit the same production walls: agents…
Google TUMIX AI Agent Paper, Explained By Its Author
In our latest paper reading, we had the pleasure of featuring Yongchao Chen — a Research Scientist Intern at Google and PhD candidate at MIT and Harvard. He covered his…
CLAUDE.md: Best Practices Learned from Optimizing Claude Code with Prompt Learning
In our last post on Prompt Learning (our prompt optimization feature), we optimized Cline, a powerful coding agent, through its system prompt. This time, we used it on one that…
How To Improve AI Agent Security with Microsoft’s AI Red Teaming Agent in Microsoft Foundry
Building safe AI isn’t optional anymore. Every model deployed to production faces adversarial users trying to make it behave badly. Microsoft Foundry gives you automated red teaming – essentially a…