Agentic AI Atlas · AgentBench multi-environment suite
test-set:agentbench-environments · a5c.ai
II. TestSet overview

test-set:agentbench-environments

Reference · live

AgentBench multi-environment suite overview

Canonical AgentBench artifact for broad LLM-as-agent evaluation.

TestSet · Outgoing · 1 · Incoming · 0

Attributes

displayName
AgentBench multi-environment suite
benchmarkId
environmentCount
8
releasedAt
2023-08-07
composition
Multi-environment LLM-as-agent benchmark covering eight interactive environments for reasoning, decision-making, tool use, and instruction following.
homepageUrl
description
Canonical AgentBench artifact for broad LLM-as-agent evaluation.

Outgoing edges

belongs_to_benchmark · 1

Incoming edges

None.
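
For readers who want to handle this record programmatically, the attributes and edge counts above could be held in a small typed structure. This is a minimal sketch, not the Atlas's actual schema: the class name `AtlasRecord`, the snake_case field names, and the helper `missing_attributes` are all hypothetical. The two attributes that are blank in the record (benchmarkId, homepageUrl) are left unset rather than guessed.

```python
from dataclasses import dataclass, field, fields
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AtlasRecord:
    """Hypothetical container mirroring the attributes listed in this record."""
    record_id: str
    display_name: str
    environment_count: int
    released_at: date
    composition: str
    description: str
    benchmark_id: Optional[str] = None   # blank in the record, so left unset
    homepage_url: Optional[str] = None   # blank in the record, so left unset
    # Edge type -> number of edges of that type, as shown in the record.
    outgoing_edges: Dict[str, int] = field(default_factory=dict)
    incoming_edges: Dict[str, int] = field(default_factory=dict)


def missing_attributes(rec: AtlasRecord) -> List[str]:
    """Return the names of attributes that have no value, in field order."""
    return [f.name for f in fields(rec) if getattr(rec, f.name) is None]


record = AtlasRecord(
    record_id="test-set:agentbench-environments",
    display_name="AgentBench multi-environment suite",
    environment_count=8,
    released_at=date(2023, 8, 7),
    composition=(
        "Multi-environment LLM-as-agent benchmark covering eight interactive "
        "environments for reasoning, decision-making, tool use, and "
        "instruction following."
    ),
    description="Canonical AgentBench artifact for broad LLM-as-agent evaluation.",
    outgoing_edges={"belongs_to_benchmark": 1},
)

if __name__ == "__main__":
    print(missing_attributes(record))
```

Keeping the blank attributes as `None` makes the gaps in the record explicit, so a consumer can detect them instead of silently treating them as empty strings.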