What is AgentBench?

AgentBench

AgentBench is a benchmarking suite designed to evaluate how well large language models (LLMs) perform when acting as agents. It comprises a set of interactive environments and tasks that test agent capabilities such as reasoning, decision-making, and collaboration.
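To make the idea of scoring an agent across several environments concrete, here is a minimal sketch. It is not the AgentBench API: the `Task` type, the `run_task` and `evaluate` helpers, the toy environments, and the `constant_agent` function are all hypothetical names chosen for illustration, and the exact-match success check is an assumed stand-in for whatever scoring a real benchmark uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for illustration only; not part of AgentBench.
Agent = Callable[[str], str]  # maps an observation to an action


@dataclass
class Task:
    prompt: str
    expected: str


def run_task(agent: Agent, task: Task, max_turns: int = 3) -> bool:
    """Let the agent interact for up to max_turns; succeed on an exact match."""
    observation = task.prompt
    for _ in range(max_turns):
        action = agent(observation)
        if action == task.expected:
            return True
        # Feed the failed attempt back so the agent can revise its answer.
        observation = f"{task.prompt}\nPrevious answer '{action}' was incorrect."
    return False


def evaluate(agent: Agent, environments: Dict[str, List[Task]]) -> Dict[str, float]:
    """Return the success rate per environment (each task list must be non-empty)."""
    return {
        name: sum(run_task(agent, task) for task in tasks) / len(tasks)
        for name, tasks in environments.items()
    }


def constant_agent(observation: str) -> str:
    # Toy stand-in for an LLM call: always answers "4".
    return "4"


# Toy environments, assumed for this sketch.
environments = {
    "arithmetic": [Task("What is 2 + 2?", "4"), Task("What is 3 + 3?", "6")],
    "lookup": [Task("What is the capital of France?", "Paris")],
}

scores = evaluate(constant_agent, environments)
print(scores)
```

Reporting one score per environment, rather than a single aggregate, keeps it visible when an agent does well on one kind of task and poorly on another.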
