AgentBench is a benchmarking suite designed to evaluate the performance of large language models (LLMs) acting as agents. It comprises a set of interactive environments and tasks that test agent capabilities such as reasoning, decision-making, and collaboration.
What is AgentBench?
