Haziqa Said, Author at Arize AI

Haziqa Said

Recent posts by Haziqa Said

40 Large Language Model Benchmarks and The Future of Model Evaluation

With the accelerated development of GenAI, there is a particular focus on its testing and evaluation, resulting in the release of several LLM benchmarks. Each of these benchmarks tests the…

17-minute read

Large Language Models, LLM Evals
