LLM Evaluation Essentials

On Demand

Virtual

Session 1: Benchmarking and Analyzing Retrieval Approaches

Session 2: Statistical Analysis of Summarization LLM Evaluations

Session 3: Statistical Analysis of Hallucination LLM Evaluations

Step into the world of LLM evaluations with a three-part series dedicated to achieving production excellence. We’ll unpack advanced evaluation techniques and best practices, formulated through rigorous testing across retrieval, summarization, and hallucination, to help ensure production readiness. A must-attend for AI and ML engineers and data scientists. This series will cover:

Binary LLM performance evaluation and its benefits
Golden datasets and how to use them
Statistical analysis of the performance of GPT-4, GPT-3.5, and more
Best practices for LLM evals

Speakers

Jason Lopatecki

CEO and Co-Founder

Aparna Dhinakaran

CPO and Co-Founder

Jerry Liu

CEO & Co-Founder, LlamaIndex

Dat Ngo

ML Solutions Architect

Trevor LaViale

ML Solutions Engineer
