II. TestSet overview
Reference · live · test-set:bigcode-evalplus
BigCode EvalPlus overview
Canonical EvalPlus HumanEval+ release used in many post-2023 code-LLM evaluations.
Attributes
displayName
BigCode EvalPlus
benchmarkId
caseCount
164
releasedAt
2023-05-08
composition
EvalPlus extends HumanEval and MBPP with many more test cases
(~80x for HumanEval+, ~35x for MBPP+) generated via type-aware
mutation. The added tests expose functionally incorrect solutions
that pass the original tests but fail under stricter testing. This
entry represents the HumanEval+ portion.
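The idea behind type-aware mutation can be illustrated with a minimal sketch. This is not the EvalPlus implementation: the helper names (variants, expand, failing_inputs), the mutation rules, and the toy "maximum of a list" task are all assumptions chosen for illustration. It derives new inputs from a seed input according to each value's type, then compares a candidate solution against a reference solution on the enlarged input pool.

```python
import copy

def variants(value):
    """Yield one-step variants of value, chosen by its type (illustrative rules)."""
    if isinstance(value, bool):            # bool first: bool is a subclass of int
        yield not value
    elif isinstance(value, int):
        yield value - 1
        yield value + 1
    elif isinstance(value, str):
        for i in range(len(value)):
            yield value[:i] + value[i + 1:]            # drop character i
        yield value + value[-1:]                       # repeat last character
    elif isinstance(value, list):
        for i in range(len(value)):
            yield value[:i] + value[i + 1:]            # drop element i
            for v in variants(value[i]):
                yield value[:i] + [v] + value[i + 1:]  # mutate element i
        yield value + value[-1:]                       # repeat last element

def expand(seeds, depth):
    """Grow the input pool by applying variants() for `depth` rounds."""
    pool = list(seeds)
    frontier = list(seeds)
    for _ in range(depth):
        next_frontier = []
        for item in frontier:
            for v in variants(item):
                if v not in pool:          # skip inputs already in the pool
                    pool.append(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return pool

def failing_inputs(candidate, reference, inputs):
    """Return the inputs on which candidate disagrees with reference."""
    return [x for x in inputs
            if candidate(copy.deepcopy(x)) != reference(copy.deepcopy(x))]

# Hypothetical task: return the maximum of a list, or None if it is empty.
def reference_max(xs):
    return max(xs) if xs else None

def buggy_max(xs):
    return xs[-1] if xs else None          # only correct when the maximum is last

seeds = [[1, 2, 3]]                        # stands in for the original tests
pool = expand(seeds, depth=2)
failures = failing_inputs(buggy_max, reference_max, pool)
print(len(pool), "inputs,", len(failures), "expose the bug")
```

In this sketch the buggy candidate agrees with the reference on the seed input, and the disagreement only shows up on mutated inputs such as [1, 2, 1]. Exhaustive enumeration is used here instead of random sampling so that the example is deterministic.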
homepageUrl
description
Canonical EvalPlus HumanEval+ release used in many post-2023
code-LLM evaluations.
Outgoing edges
belongs_to_benchmark (1)
- benchmark:bigcode-evalplus · Benchmark · EvalPlus
Incoming edges
None.