LLM evaluation is a discipline where confusion reigns and foundation model builders are effectively grading their own homework.
Building on viral threads on X/Twitter, Greg Kamradt, Robert Nishihara, and Jason Lopatecki discuss highlights from Arize AI’s ongoing research into how major foundation models – from OpenAI’s GPT-4 to Mistral and Anthropic’s Claude – stack up against each other on important tasks and emerging LLM use cases. They walk through results from Needle in a Haystack tests and other evals – hallucination detection on private data, question answering, code functionality, and more – and explain why those results matter.
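For context, a Needle in a Haystack test plants a small fact (the “needle”) at varying depths inside a long distractor context and checks whether the model can retrieve it. The Python sketch below is a minimal illustration of that setup, not Arize’s actual harness; `query_model` is a hypothetical stand-in for whatever LLM API you call.

```python
# Minimal Needle in a Haystack sketch (illustrative only; real harnesses
# vary context length, needle placement, and scoring far more rigorously).

NEEDLE = "The secret ingredient in the recipe is cardamom."
QUESTION = "What is the secret ingredient in the recipe?"
EXPECTED = "cardamom"

def build_haystack(filler: str, needle: str, depth: float, n_chars: int) -> str:
    """Embed the needle at a relative depth (0.0 = start, 1.0 = end)
    inside n_chars of distractor text."""
    haystack = (filler * (n_chars // len(filler) + 1))[:n_chars]
    cut = int(len(haystack) * depth)
    return haystack[:cut] + " " + needle + " " + haystack[cut:]

def run_test(query_model, filler: str,
             depths=(0.0, 0.25, 0.5, 0.75, 1.0),
             n_chars: int = 50_000) -> dict:
    """Check retrieval at several insertion depths; returns pass/fail per depth.
    `query_model` is assumed to take a prompt string and return the model's reply."""
    results = {}
    for depth in depths:
        context = build_haystack(filler, NEEDLE, depth, n_chars)
        prompt = f"{context}\n\nAnswer using only the text above.\n{QUESTION}"
        answer = query_model(prompt)
        results[depth] = EXPECTED.lower() in answer.lower()
    return results
```

Sweeping both context length and needle depth like this produces the retrieval heatmaps the tests are known for, and it is where models that look identical on short prompts start to diverge.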
Curious which foundation models your company should be using for a specific use case – and which to avoid? You won’t want to miss this meetup!