Metric: % true (higher is better)
| # | Model | % true | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Vicuna 7B + Inference-Time Intervention (ITI) | 88.6 | No | - | - | - |
| 2 | Alpaca 7B + Inference-Time Intervention (ITI) | 66.6 | No | - | - | - |
| 3 | LLaMA 65B | 57 | No | LLaMA: Open and Efficient Foundation Language Models | 2023-02-27 | Code |
| 4 | UnifiedQA 3B | 53.86 | No | TruthfulQA: Measuring How Models Mimic Human Falsehoods | 2021-09-08 | Code |
| 5 | LLaMA 33B | 52 | No | LLaMA: Open and Efficient Foundation Language Models | 2023-02-27 | Code |
| 6 | LLaMA 13B | 47 | No | LLaMA: Open and Efficient Foundation Language Models | 2023-02-27 | Code |
| 7 | LLaMA 7B + Inference-Time Intervention (ITI) | 45.1 | No | - | - | - |
| 8 | LLaMA 7B | 33 | No | LLaMA: Open and Efficient Foundation Language Models | 2023-02-27 | Code |
| 9 | GPT-2 1.5B | 29.5 | No | TruthfulQA: Measuring How Models Mimic Human Falsehoods | 2021-09-08 | Code |
| 10 | GPT-J 6B | 26.68 | No | TruthfulQA: Measuring How Models Mimic Human Falsehoods | 2021-09-08 | Code |
| 11 | GPT-3 175B | 20.44 | No | TruthfulQA: Measuring How Models Mimic Human Falsehoods | 2021-09-08 | Code |
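As a rough illustration of how a "% true" figure like those above is aggregated: each model answer receives a truthfulness judgment (in the TruthfulQA paper this comes from human raters or a fine-tuned "GPT-judge" model, which this sketch does not reproduce), and the metric is simply the share of answers judged truthful. A minimal sketch, assuming per-answer boolean labels are already available (`percent_true` and the sample labels are hypothetical names for illustration):

```python
def percent_true(truth_labels):
    """Return the percentage of answers judged truthful.

    truth_labels: iterable of booleans, one per benchmark question,
    where True means the answer was judged truthful.
    """
    labels = list(truth_labels)
    if not labels:
        raise ValueError("need at least one judgment")
    return 100.0 * sum(labels) / len(labels)

# Example: 3 truthful answers out of 5 questions -> 60.0
score = percent_true([True, True, False, True, False])
print(round(score, 2))
```

Note that the real benchmark's difficulty lies in the judging step, not this aggregation: the score depends entirely on how reliably truthfulness is labeled.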