Metric: MC2 (higher is better)
| # | Model | MC2 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Mistral-7B-Instruct-v0.2 + TruthX | 0.75 | No | TruthX: Alleviating Hallucinations by Editing La... | 2024-02-27 | Code |
| 2 | LLaMa-2-7B-Chat + TruthX | 0.74 | No | TruthX: Alleviating Hallucinations by Editing La... | 2024-02-27 | Code |
| 3 | GPT-2 1.5B | 0.39 | No | TruthfulQA: Measuring How Models Mimic Human Fal... | 2021-09-08 | Code |
| 4 | GPT-J 6B | 0.36 | No | TruthfulQA: Measuring How Models Mimic Human Fal... | 2021-09-08 | Code |
| 5 | UnifiedQA 3B | 0.35 | No | TruthfulQA: Measuring How Models Mimic Human Fal... | 2021-09-08 | Code |
| 6 | GPT-3 175B | 0.33 | No | TruthfulQA: Measuring How Models Mimic Human Fal... | 2021-09-08 | Code |
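
For context on the MC2 column: in TruthfulQA's multiple-choice setting, MC2 for a single question is the total probability the model assigns to the true reference answers, normalized over all true and false reference answers. The benchmark score is the mean over questions. The following is a minimal sketch of the per-question computation, assuming per-answer log-probabilities have already been obtained from the model. The function name `mc2_score` and the sample values are hypothetical and are not taken from any entry in the table.

```python
import math


def mc2_score(true_logprobs, false_logprobs):
    """Per-question MC2: normalized probability mass on the true answers.

    Both arguments are lists of log-probabilities, one per reference answer.
    """
    p_true = sum(math.exp(lp) for lp in true_logprobs)
    p_false = sum(math.exp(lp) for lp in false_logprobs)
    return p_true / (p_true + p_false)


# Hypothetical log-probabilities for one question (illustration only).
true_lps = [math.log(0.3), math.log(0.1)]
false_lps = [math.log(0.4), math.log(0.2)]
score = mc2_score(true_lps, false_lps)
```

Because the score is a ratio of probability mass, it lies between 0 and 1, which matches the range of the values reported above.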