AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving Paper • 2506.12508 • Published Jun 14, 2025
Trae Agent: An LLM-based Agent for Software Engineering with Test-time Scaling Paper • 2507.23370 • Published Jul 31, 2025
MCP-AgentBench: Evaluating Real-World Language Agent Performance with MCP-Mediated Tools Paper • 2509.09734 • Published Sep 10, 2025
LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools? Paper • 2508.01780 • Published Aug 3, 2025
API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs Paper • 2304.08244 • Published Apr 14, 2023