Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Question Answering on TruthfulQA

Metric: % true (GPT-judge) (higher is better)
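As a rough illustration of how a "% true" score could be derived, the sketch below assumes each model answer has already been labeled true or false by a judge and simply reports the share labeled true. The function name `percent_true` and the example labels are hypothetical and are not taken from the benchmark's own code.

```python
from typing import Sequence


def percent_true(judgments: Sequence[bool]) -> float:
    """Return the percentage of answers labeled true, rounded to 2 decimals."""
    if not judgments:
        raise ValueError("need at least one judged answer")
    # True counts as 1 and False as 0, so sum() gives the number judged true.
    return round(100.0 * sum(judgments) / len(judgments), 2)


# Hypothetical labels for four answers: three judged true, one judged false.
labels = [True, True, False, True]
print(percent_true(labels))
```

Because higher is better, a model whose answers are more often judged true ranks above one with fewer true answers.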

Results

| # | Model | % true (GPT-judge) | Extra Data | Paper | Date | Code |
|---|-------|--------------------|------------|-------|------|------|
| 1 | UnifiedQA 3B | 53.24 | No | TruthfulQA: Measuring How Models Mimic Human Fal... | 2021-09-08 | Code |
| 2 | GPT-2 1.5B | 29.87 | No | TruthfulQA: Measuring How Models Mimic Human Fal... | 2021-09-08 | Code |
| 3 | GPT-J 6B | 27.17 | No | TruthfulQA: Measuring How Models Mimic Human Fal... | 2021-09-08 | Code |
| 4 | GPT-3 175B | 20.56 | No | TruthfulQA: Measuring How Models Mimic Human Fal... | 2021-09-08 | Code |
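
The ordering of the results can be reproduced with a short sketch that sorts the four reported scores in descending order, since higher is better for this metric. The `Result` type and `rank` helper are illustrative names, not part of the site; the scores are copied from the results listed here.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    model: str
    percent_true: float


# Scores copied from the results on this page, deliberately unsorted.
RESULTS = [
    Result("GPT-3 175B", 20.56),
    Result("UnifiedQA 3B", 53.24),
    Result("GPT-J 6B", 27.17),
    Result("GPT-2 1.5B", 29.87),
]


def rank(results: List[Result]) -> List[Result]:
    """Sort descending by score, because higher is better for this metric."""
    return sorted(results, key=lambda r: r.percent_true, reverse=True)


for position, result in enumerate(rank(RESULTS), start=1):
    print(position, result.model, result.percent_true)
```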