Agentic AI Atlas by a5c.ai
Agentic AI Atlas · HumanEval
Benchmark overview

benchmark:human-eval

Reference · live

HumanEval overview

Hand-written programming problems for evaluating code generation.

Benchmark · Outgoing · 1 · Incoming · 20

Attributes

displayName: HumanEval
homepageUrl: https://github.com/openai/human-eval
kind: function-completion
targetsKind: ModelVersion
description: Hand-written programming problems for evaluating code generation.

Outgoing edges

covers · 1
  • skill-area:python-implementation · SkillArea · Python Function Implementation

Incoming edges

belongs_to_benchmark · 1
  • test-set:humaneval-original · TestSet · HumanEval original problem set
bounds_subject · 1
  • scope-boundary:human-eval.scope · ScopeBoundary
for_benchmark · 9
  • eval-run:human-eval.qwen-2-5-72b.2024-09 · EvalRun
  • eval-run:human-eval.qwen-2-5-coder-32b.2024-11 · EvalRun
  • eval-run:human-eval.claude-sonnet-4-6.2025-11 · EvalRun
  • eval-run:human-eval.deepseek-v3.2024-12 · EvalRun
  • eval-run:human-eval.llama-3-1-405b.2024-07 · EvalRun
  • eval-run:human-eval.llama-3-3-70b.2024-12 · EvalRun
  • eval-run:human-eval.mistral-large-2.2024-07 · EvalRun
  • eval-run:human-eval.codestral-25-01.2025-01 · EvalRun
  • eval-run:human-eval.gpt-5.2025-08 · EvalRun
scored_against · 9
  • eval-result:human-eval.qwen-2-5-72b.001 · EvalResult
  • eval-result:human-eval.qwen-2-5-coder-32b.001 · EvalResult
  • eval-result:human-eval.claude-sonnet-4-6.001 · EvalResult
  • eval-result:human-eval.deepseek-v3.001 · EvalResult
  • eval-result:human-eval.llama-3-1-405b.001 · EvalResult
  • eval-result:human-eval.llama-3-3-70b.001 · EvalResult
  • eval-result:human-eval.mistral-large-2.001 · EvalResult
  • eval-result:human-eval.codestral-25-01.001 · EvalResult
  • eval-result:human-eval.gpt-5.001 · EvalResult
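
The record above can be read as a small labeled graph: one node with attributes, plus directed edges grouped by label. The following is a minimal sketch of that reading in Python, useful mainly for checking that the per-label counts add up to the stated totals. The class names `Edge` and `Record`, the helper `build_human_eval_record`, and the edge direction convention (source, label, target) are assumptions made for illustration; they are not the Atlas's actual schema or API. Node ids, edge labels, and attribute values are copied from the record.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    """One directed edge: source --label--> target (hypothetical shape)."""
    source: str
    label: str
    target: str


@dataclass
class Record:
    """Hypothetical in-memory form of a record page, not the Atlas schema."""
    node_id: str
    attributes: dict
    edges: list = field(default_factory=list)

    def outgoing(self):
        # Edges that start at this node.
        return [e for e in self.edges if e.source == self.node_id]

    def incoming(self):
        # Edges that end at this node.
        return [e for e in self.edges if e.target == self.node_id]

    def incoming_counts(self):
        # Number of incoming edges per label.
        return Counter(e.label for e in self.incoming())


BENCH = "benchmark:human-eval"

# (model slug, run date) pairs as they appear in the eval-run ids above.
RUNS = [
    ("qwen-2-5-72b", "2024-09"),
    ("qwen-2-5-coder-32b", "2024-11"),
    ("claude-sonnet-4-6", "2025-11"),
    ("deepseek-v3", "2024-12"),
    ("llama-3-1-405b", "2024-07"),
    ("llama-3-3-70b", "2024-12"),
    ("mistral-large-2", "2024-07"),
    ("codestral-25-01", "2025-01"),
    ("gpt-5", "2025-08"),
]


def build_human_eval_record():
    """Assemble the record from the attributes and edges listed on the page."""
    record = Record(
        node_id=BENCH,
        attributes={
            "displayName": "HumanEval",
            "homepageUrl": "https://github.com/openai/human-eval",
            "kind": "function-completion",
            "targetsKind": "ModelVersion",
        },
    )
    record.edges.append(Edge(BENCH, "covers", "skill-area:python-implementation"))
    record.edges.append(
        Edge("test-set:humaneval-original", "belongs_to_benchmark", BENCH)
    )
    record.edges.append(
        Edge("scope-boundary:human-eval.scope", "bounds_subject", BENCH)
    )
    for slug, date in RUNS:
        record.edges.append(
            Edge(f"eval-run:human-eval.{slug}.{date}", "for_benchmark", BENCH)
        )
        record.edges.append(
            Edge(f"eval-result:human-eval.{slug}.001", "scored_against", BENCH)
        )
    return record
```

Calling `build_human_eval_record().incoming_counts()` gives the per-label tallies, and `len(record.incoming())` can be compared against the "Incoming · 20" figure in the record header.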
