How to Efficiently Fine-Tune and Serve OSS LLMs

Most teams start with a commercial LLM like GPT-4, and the appeal is clear: users gain instant access to a large language model with trillions of parameters that performs well out of the box and beats almost all open-source alternatives. The story changes once open-source LLMs are fine-tuned: they not only consistently beat GPT-4 by 4-15 points on downstream tasks, but do so while being 250x smaller and 100x cheaper at inference. In this talk, Arnav Garg, Machine Learning Engineer at Predibase, discusses how to efficiently fine-tune and serve OSS LLMs using open-source packages like Ludwig and LoRAX, focusing on the core fine-tuning and inference ideas that enable faster, cheaper, smaller, and better models than closed-source providers offer. The talk was originally delivered at Arize:Observe 2024 at Shack 15 in San Francisco on July 11, 2024.

Arnav Garg

Predibase
