II. Benchmark overview
Reference · livebenchmark:toolbench
ToolBench overview
ToolBench (OpenBMB) is a large-scale instruction-tuning and evaluation suite for LLM tool use, built on more than 16,000 real-world REST APIs collected from RapidAPI. Its companion harness, ToolEval, reports two metrics: pass rate, and win rate against a reference tool-using agent.
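As a rough illustration of the two metrics named in the overview, the sketch below shows only the final aggregation arithmetic. It assumes the per-instruction judgments (whether a run solved the task, and whether the candidate's answer was preferred over the reference agent's) have already been produced by some evaluator. The function names `pass_rate` and `win_rate` and the boolean input format are hypothetical and are not taken from the ToolEval codebase.

```python
from typing import Sequence


def pass_rate(solved: Sequence[bool]) -> float:
    """Fraction of instructions the agent completed successfully.

    `solved[i]` is True when instruction i was judged as solved.
    """
    if not solved:
        raise ValueError("no results to score")
    return sum(solved) / len(solved)


def win_rate(candidate_preferred: Sequence[bool]) -> float:
    """Fraction of pairwise comparisons in which the candidate agent's
    answer was preferred over the reference agent's answer.

    Assumption: every comparison has a strict winner (no ties).
    """
    if not candidate_preferred:
        raise ValueError("no comparisons to score")
    return sum(candidate_preferred) / len(candidate_preferred)


if __name__ == "__main__":
    # Hypothetical judgments for four instructions.
    print(pass_rate([True, False, True, True]))
    print(win_rate([True, False, False, True]))
```

Keeping the two functions separate reflects that pass rate is an absolute measure of one agent, while win rate is only meaningful relative to the chosen reference agent.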
Attributes
displayName
ToolBench
homepageUrl
kind
agent-platform
targetsKind
AgentVersion
description
ToolBench (OpenBMB) is a large-scale instruction-tuning and
evaluation suite for LLM tool use, built on more than 16,000
real-world REST APIs collected from RapidAPI. Its companion
harness, ToolEval, reports two metrics: pass rate, and win rate
against a reference tool-using agent.
Outgoing edges
applies_to (2)
- domain:software-engineering · Domain · Software Engineering
- domain:api-development · Domain · API Development
covers (2)
- skill-area:tool-use · SkillArea · LLM Tool Use
- skill-area:multi-turn-tool-use · SkillArea · Multi-Turn Tool Use
Incoming edges
belongs_to_benchmark (1)
- test-set:toolbench-tooleval · TestSet · ToolBench ToolEval suite