Continuously fine-tuning retrieval models on real user data can substantially improve retrieval accuracy, and combining fine-tuning with quantization can also greatly increase inference speed. This kind of continuous updating isn't feasible with embedding models, because every model update would require re-embedding the stored documents, which amounts to a full database migration. Caleb John, Co-Founder & CEO at Pongo, explores the benefits of regularly fine-tuned two-stage retrieval systems and a methodology for building them. This talk was originally delivered at Arize:Observe 2024 at Shack 15 in San Francisco on July 11, 2024.
Caleb John
Pongo