LLM Evaluation Essentials

  On Demand


Session 1 (10/3): Benchmarking and Analyzing Retrieval Approaches

Session 2 (10/10): Statistical Analysis of Summarization LLM Evaluations

Session 3 (10/16): Statistical Analysis of Hallucination LLM Evaluations

Step into the world of LLM evaluations with a three-part series dedicated to achieving production excellence. We’ll unpack advanced evaluation techniques and best practices, formulated through rigorous testing across retrieval, summarization, and hallucination detection, to help ensure production readiness. A must-attend for AI and ML engineers and data scientists. This series will cover:

  • Binary LLM performance evaluation and its benefits
  • Golden datasets and how to use them
  • Statistical analysis of the performance of GPT-4, GPT-3.5, and more
  • Best practices for LLM evals

Jason Lopatecki
CEO and Co-Founder

Aparna Dhinakaran
CPO and Co-Founder

Jerry Liu
CEO & Co-Founder, LlamaIndex

Dat Ngo
ML Solutions Architect

Trevor LaViale
ML Solutions Engineer
