Arize AI & Anyscale


Evaluating LLMs: Needle in a Haystack

San Francisco, United States

LLM evaluation is a discipline where confusion reigns and foundation model builders are effectively grading their own homework. Building on the viral threads on X/Twitter, Greg Kamradt, Robert Nishihara, and Jason Lopatecki discuss highlights from Arize AI's ongoing research on how major foundation models – from OpenAI's GPT-4 to Mistral and Anthropic's Claude – are stacking up...