May 9th & 16th
10:00am – 10:45am PST
Virtual
Join Arize AI’s Co-Founders for a virtual event dedicated to exploring the latest frontiers in evaluating large language models (LLMs) for complex tasks. This event will feature two insightful sessions, each delving into a unique and exciting application of LLM evaluation:
Session 1 | SQL Generation Evals: LLMs-as-a-Judge
LLM-as-a-Judge is a popular and scalable technique for evaluating LLMs on tasks such as toxicity classification, sentiment classification, and text-to-SQL generation. However, LLM-as-a-Judge evaluation has certain limitations and points of contention: a circular methodology (using one LLM to evaluate another LLM) and a disregard for the underlying database schema and data distribution. In this session, we will discuss an experiment we designed to evaluate the performance of the LLM-as-a-Judge eval for text-to-SQL tasks. We’ll take you through a framework for comparing the LLM-as-a-Judge approach with a data distribution-based eval approach for text-to-SQL tasks. We will also discuss some interesting cases that came up in our research highlighting the pitfalls of the LLM-as-a-Judge approach, along with suggestions on how it can be enhanced to account for those limitations.
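To make the pattern under discussion concrete, here is a minimal sketch of an LLM-as-a-Judge eval for text-to-SQL. The prompt wording, judge model, and `judge_sql` helper are illustrative assumptions for this post, not the framework presented in the session:

```python
# Minimal LLM-as-a-Judge sketch for text-to-SQL (illustrative only).
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY in the environment;
# the prompt and model name are placeholders.
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = """You are evaluating a text-to-SQL system.

Question: {question}
Database schema: {schema}
Generated SQL: {sql}

Answer with a single word, "correct" or "incorrect", judging whether the
SQL answers the question against the schema."""

def judge_sql(question: str, schema: str, sql: str) -> str:
    """Ask a judge LLM to label a generated query. Note the circularity
    the session highlights: one LLM grades another LLM's output, and
    unless the schema is passed in the prompt, the judge never sees it."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder judge model
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(
                question=question, schema=schema, sql=sql
            ),
        }],
        temperature=0,  # keep judgments as deterministic as possible
    )
    return response.choices[0].message.content.strip().lower()

# Example usage
label = judge_sql(
    question="How many orders were placed in 2023?",
    schema="orders(id INT, placed_at TIMESTAMP, total NUMERIC)",
    sql="SELECT COUNT(*) FROM orders WHERE EXTRACT(YEAR FROM placed_at) = 2023;",
)
print(label)  # expected: "correct"
```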
Session 2 | LLM Evals for Router-Based Architectures
The second session in our series will explore how to effectively evaluate LLMs within router-based AI architectures. Router networks dynamically route inputs to specialized LLM components, enabling more efficient and capable systems. However, evaluating the performance of these complex architectures presents unique challenges. In this session, we’ll cover key considerations and best practices for LLM evaluation in router setups.
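For a sense of what such a system looks like, here is a minimal router sketch; the route names, classifier prompt, and stubbed handlers are hypothetical, chosen only to show where per-route and end-to-end evals would attach:

```python
# Minimal router sketch (illustrative). A small classifier LLM picks a
# route, and each route is a specialized component. Evaluating such a
# system means scoring both the routing decision and each component's
# output, which is the challenge the session addresses.
from openai import OpenAI

client = OpenAI()

ROUTES = ["sql_generation", "summarization", "general_qa"]  # hypothetical routes

def choose_route(user_input: str) -> str:
    """Classify the input into one of the routes using a classifier LLM."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder router model
        messages=[{
            "role": "user",
            "content": (
                "Classify this request into exactly one of "
                f"{ROUTES}. Reply with the route name only.\n\n{user_input}"
            ),
        }],
        temperature=0,
    )
    route = response.choices[0].message.content.strip()
    return route if route in ROUTES else "general_qa"  # fall back on bad output

def handle(user_input: str) -> tuple[str, str]:
    """Route the input and return (route, response) so evals can score
    routing accuracy separately from response quality."""
    route = choose_route(user_input)
    # Each route would call its own specialized model/prompt; stubbed here.
    handlers = {
        "sql_generation": lambda q: f"[SQL component handles: {q}]",
        "summarization": lambda q: f"[Summarizer handles: {q}]",
        "general_qa": lambda q: f"[General QA handles: {q}]",
    }
    return route, handlers[route](user_input)

route, answer = handle("Write a query counting 2023 orders.")
print(route, "->", answer)  # the routing decision is itself an eval target
```

Returning the route alongside the response is a deliberate choice here: it lets an eval pipeline grade the routing decision and the component output as separate metrics rather than only scoring the final answer.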