II.
Benchmark overview
Reference · livebenchmark:berkeley-function-calling
Berkeley Function Calling Leaderboard (BFCL) overview
BFCL (Berkeley Function Calling Leaderboard, from the Gorilla project) is the canonical public leaderboard for LLM function-calling and tool-use accuracy across simple, parallel, multiple, and live function-calling categories. Versions v1, v2 (live), and v3 (multi-turn / multi-step) have been released.
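The category names above are easier to tell apart with a small example. The sketch below is illustrative only and is not BFCL's own code. It assumes the commonly described reading of the categories: "simple" offers one function and expects one call, "multiple" offers several functions and expects one call, and "parallel" expects several calls from one query. The function name `categorize`, the label strings, and the combined "parallel_multiple" label are hypothetical choices made for this sketch.

```python
def categorize(available_functions: list[str], expected_calls: list[dict]) -> str:
    """Label a test case by how many functions are offered and how many calls are expected.

    Hypothetical helper for illustration; the labels are assumptions, not BFCL identifiers.
    """
    if not available_functions or not expected_calls:
        raise ValueError("need at least one available function and one expected call")

    many_functions = len(available_functions) > 1  # model must choose among functions
    many_calls = len(expected_calls) > 1           # model must emit several calls at once

    if many_functions and many_calls:
        return "parallel_multiple"
    if many_calls:
        return "parallel"
    if many_functions:
        return "multiple"
    return "simple"


# One function offered, one call expected.
simple_case = categorize(
    ["get_weather"],
    [{"name": "get_weather", "args": {"city": "Paris"}}],
)

# One function offered, two calls expected from a single query.
parallel_case = categorize(
    ["get_weather"],
    [
        {"name": "get_weather", "args": {"city": "Paris"}},
        {"name": "get_weather", "args": {"city": "Rome"}},
    ],
)

# Two functions offered, one call expected.
multiple_case = categorize(
    ["get_weather", "get_time"],
    [{"name": "get_time", "args": {"city": "Paris"}}],
)

print(simple_case, parallel_case, multiple_case)
```

The "live" and multi-turn categories mentioned above describe where the prompts come from and how many turns a case spans, so they are not captured by this call-count sketch.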
Attributes
displayName
Berkeley Function Calling Leaderboard (BFCL)
homepageUrl
kind
model-only
targetsKind
ModelVersion
description
BFCL (Berkeley Function Calling Leaderboard, from the Gorilla
project) is the canonical public leaderboard for LLM
function-calling and tool-use accuracy across simple, parallel,
multiple, and live function-calling categories. Versions v1,
v2 (live), and v3 (multi-turn / multi-step) have been released.
Outgoing edges
covers (1)
- skill-area:tool-use · SkillArea · LLM Tool Use
Incoming edges
for_benchmark (2)
scored_against (2)
- eval-result:bfcl.claude-sonnet-4-5.001 · EvalResult
- eval-result:bfcl.gpt-5.001 · EvalResult