Metric: Perplexity (lower is better)
| # | Model | Perplexity | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | GPT-3 175B (Few-Shot) | 1.92 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 2 | GPT-3 175B (Zero-Shot) | 3.00 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 3 | GPT-3 13B (Zero-Shot) | 3.56 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 4 | Pythia 12B (Zero-Shot) | 3.92 | No | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | 2023-04-03 | Code |
| 5 | GPT-J-6B | 3.99 | No | - | - | - |
| 6 | GPT-3 6.7B (Zero-Shot) | 4.00 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 7 | Mamba-2.8B | 4.23 | No | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | 2023-12-01 | Code |
| 8 | Pythia 6.9B (Zero-Shot) | 4.45 | No | Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | 2023-04-03 | Code |
| 9 | GPT-3 2.7B (Zero-Shot) | 4.60 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 10 | GPT-2 1.5B (Zero-Shot) | 8.63 | No | - | - | Code |
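For context on the metric: perplexity is the exponential of the average per-token negative log-likelihood, so it can be read as the effective branching factor the model faces at each token. A minimal sketch of the computation (the `token_logprobs` input is a hypothetical list of per-token log-probabilities from any model):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean per-token log-probability.

    token_logprobs: natural-log probabilities the model assigned to each
    observed token in the evaluation text.
    """
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# A model that assigns every token probability 0.25 has perplexity 4:
# it is as uncertain as a uniform choice among 4 options.
print(perplexity([math.log(0.25)] * 10))
```

Lower is better because a smaller perplexity means the model assigned higher probability to the text it was evaluated on.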