Phoenix: How To Run An LLM Evaluation

This demo covers how to run custom LLM evaluations in Phoenix using an LLM-as-a-judge approach for a function-calling agent. It walks through setting up evaluation data frames, the different evaluation methods (llm_generate and llm_classify), and exporting results back to the Phoenix UI.

📓 Notebook: https://colab.research.google.com/gist/PubliusAu/be1fd140aa4de1491bfa6ca5859464ca/bring-your-own-evaluator-phoenix-example.ipynb#scrollTo=is3clylxi_XI

🔗 Other Handy Links
Arize Phoenix: https://phoenix.arize.com/
How to bring your own evaluator: https://docs.arize.com/phoenix/evaluation/how-to-evals/bring-your-own-evaluator
Follow John Gilhuly: https://www.linkedin.com/in/john-gilhuly-25a15888/
Join the community to ask questions: https://join.slack.com/t/arize-ai/shared_invite/zt-26zg4u3lw-OjUNoLvKQ2Yv53EfvxW6Kg
⭐️ Star Phoenix on GitHub: https://github.com/Arize-ai/phoenix

⏸️ Timestamps:
00:00 Introduction to Custom Evals in Phoenix
02:21 Generating Data to Evaluate
05:06 Running Evaluations with Phoenix
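As a rough sketch of the workflow described above, the snippet below builds an evaluation dataframe and a judge prompt template of the kind you would pass to Phoenix's llm_classify. The column names, example rows, and template wording are illustrative assumptions, not taken from the video; the actual LLM call is shown only in comments since it requires an API key.

```python
# Hypothetical sketch of preparing data for an LLM-as-a-judge evaluation of a
# function-calling agent. Column names and rows are made up for illustration.
import pandas as pd

# Example agent outputs to evaluate (fabricated data).
df = pd.DataFrame(
    {
        "question": [
            "What is the weather in Paris?",
            "Book a table for two at 7pm.",
        ],
        "tool_call": [
            'get_weather(city="Paris")',
            'search_flights(destination="Paris")',
        ],
    }
)

# A judge prompt template; {question} and {tool_call} are filled in per row.
TOOL_CALLING_TEMPLATE = """
You are evaluating whether a function call correctly addresses a user question.

[Question]: {question}
[Function Call]: {tool_call}

Respond with a single word, "correct" or "incorrect".
"""

# The allowed judge outputs ("rails") the classifier snaps responses to.
rails = ["correct", "incorrect"]

# With an OpenAI key configured, running the evaluation would look roughly like:
#   from phoenix.evals import OpenAIModel, llm_classify
#   results = llm_classify(
#       dataframe=df,
#       template=TOOL_CALLING_TEMPLATE,
#       model=OpenAIModel(model="gpt-4o"),
#       rails=rails,
#       provide_explanation=True,
#   )
# `results` would hold one label (and explanation) per input row, which can
# then be logged back to the Phoenix UI.
```

See the linked notebook for the end-to-end version, including exporting spans from Phoenix and logging the evaluation results back.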
