Question Answering on TruthfulQA

Metric: ROUGE (higher is better)

Results

#	Model	ROUGE	Extra Data	Paper	Date	Code
1	UnifiedQA 3B	1.76	No	TruthfulQA: Measuring How Models Mimic Human Falsehoods	2021-09-08	Code
2	GPT-2 1.5B	-9.41	No	TruthfulQA: Measuring How Models Mimic Human Falsehoods	2021-09-08	Code
3	GPT-J 6B	-11.35	No	TruthfulQA: Measuring How Models Mimic Human Falsehoods	2021-09-08	Code
4	GPT-3 175B	-17.75	No	TruthfulQA: Measuring How Models Mimic Human Falsehoods	2021-09-08	Code
