II. TestSet overview
Reference · live · test-set:hellaswag-validation
HellaSwag validation overview
Standard HellaSwag validation split. Nearly all reported HellaSwag numbers in modern model cards refer to this set.
Attributes
displayName
HellaSwag validation
benchmarkId
caseCount
10042
releasedAt
2019-05-19
composition
The validation split of HellaSwag (10,042 multiple-choice
sentence-completion items adversarially filtered against an
ELMo/BERT discriminator). The test-split labels are not public,
so most evaluations report on the validation split instead.
homepageUrl
description
Standard HellaSwag validation split. Nearly all reported HellaSwag
numbers in modern model cards refer to this set.
Outgoing edges
belongs_to_benchmark (1)
- benchmark:hellaswag · Benchmark · HellaSwag
Incoming edges
None.
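
For readers who want to consume this entry programmatically, the attributes and edges above could be held in a small typed record. The following is only a sketch: the class `EvalSetRecord` and the helper `validate` are hypothetical names, not part of any published schema, and the identifier is assumed to be `test-set:hellaswag-validation` (reading "live" in the reference line as a status badge). Attributes that are blank on this page (`benchmarkId`, `homepageUrl`) are left unset rather than guessed.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class EvalSetRecord:
    """Hypothetical in-memory form of a test-set entry (illustrative only)."""
    id: str
    displayName: str
    caseCount: int
    releasedAt: date
    benchmarkId: Optional[str] = None   # blank on the page, left unset
    homepageUrl: Optional[str] = None   # blank on the page, left unset
    outgoing: Dict[str, List[str]] = field(default_factory=dict)
    incoming: Dict[str, List[str]] = field(default_factory=dict)


def validate(record: EvalSetRecord) -> List[str]:
    """Return a list of human-readable problems; empty means the record looks sane."""
    problems = []
    if record.caseCount <= 0:
        problems.append("caseCount must be positive")
    # Assumption: belongs_to_benchmark targets use the 'benchmark:' prefix.
    for target in record.outgoing.get("belongs_to_benchmark", []):
        if not target.startswith("benchmark:"):
            problems.append(f"unexpected edge target: {target}")
    return problems


# Values copied from the entry above; the id is an assumption (see lead-in).
record = EvalSetRecord(
    id="test-set:hellaswag-validation",
    displayName="HellaSwag validation",
    caseCount=10042,
    releasedAt=date.fromisoformat("2019-05-19"),
    outgoing={"belongs_to_benchmark": ["benchmark:hellaswag"]},
)

print(validate(record))
```

Keeping the blank attributes as `None` rather than filling them makes it explicit which fields the source entry actually supplies.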