Troubleshooting Large Language Models – an Interview with Amber Roberts
Virtual, United States
Community Paper Reading: Phi-2 – Small Language Models (SLMs) vs. LLMs
Virtual, United States
RAG Time! Evaluate RAG with LLM Evals and Benchmarking
Virtual, United States
Community Paper Reading: RAG vs Fine-tuning
Virtual, United States
Evaluating LLMs: Needle in a Haystack
San Francisco, United States
LLM evaluation is a discipline where confusion reigns and foundation model builders are effectively grading their own homework. Building on the viral threads on X/Twitter, Greg Kamradt, Robert Nishihara, and Jason Lopatecki discuss highlights from Arize AI's ongoing research on how major foundation models – from OpenAI's GPT-4 to Mistral and Anthropic's Claude – are stacking up...
Path to Production: LLM System Evaluations and Observability
San Francisco, United States
Community Paper Reading: Exploring Sora & Evaluating Large Video Generation Models
Virtual, United States
DevOps for GenAI Hackathon SF by MongoDB
San Francisco, United States
Community Paper Reading: Reinforcement Learning in the Era of LLMs
Virtual, United States
Community Paper Reading: Claude-3
Virtual, United States