Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Language Modelling on LAMBADA

Metric: Accuracy (higher is better)
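
LAMBADA asks a model to predict the final word of a passage, and the accuracy reported here is the percentage of passages where that prediction is correct. The sketch below illustrates that calculation under stated assumptions: the function name `last_word_accuracy` is hypothetical, and exact string match after stripping whitespace is an assumed scoring rule. Individual papers on this leaderboard may tokenize or normalize differently, so scores are not guaranteed to be reproducible with this code.

```python
from typing import Sequence


def last_word_accuracy(predictions: Sequence[str], targets: Sequence[str]) -> float:
    """Return the percentage of passages whose predicted final word matches the target.

    Assumption: a prediction counts as correct only on an exact string match
    after stripping surrounding whitespace. Real evaluations may differ.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one example is required")
    correct = sum(p.strip() == t.strip() for p, t in zip(predictions, targets))
    return 100.0 * correct / len(targets)


if __name__ == "__main__":
    # Toy data, not taken from the dataset: three of four predictions match.
    score = last_word_accuracy(["door", "Mary", "car", "home"],
                               ["door", "Mary", "bike", "home"])
    print(f"Accuracy: {score:.2f}")
```

Because the metric is a percentage, higher values are better, which is why the results are sorted in descending order.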

Results

| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | PaLM-540B (Few-Shot) | 89.7 | No | PaLM: Scaling Language Modeling with Pathways | 2022-04-05 | Code |
| 2 | PaLM 2-L (one-shot) | 86.9 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 3 | GPT-3 175B (Few-Shot) | 86.4 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 4 | LLaMA-65B+CFG (Zero-Shot) | 84 | No | Stay on topic with Classifier-Free Guidance | 2023-06-30 | - |
| 5 | LLaMA-30B+CFG (zero-shot) | 83.9 | No | Stay on topic with Classifier-Free Guidance | 2023-06-30 | - |
| 6 | PaLM 2-M (one-shot) | 83.7 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 7 | Cohere Large | 82.33 | No | - | - | - |
| 8 | LLaMA-13B+CFG (zero-shot) | 82.2 | No | Stay on topic with Classifier-Free Guidance | 2023-06-30 | - |
| 9 | PaLM-540B (One-Shot) | 81.8 | No | PaLM: Scaling Language Modeling with Pathways | 2022-04-05 | Code |
| 10 | GLaM 62B/64E (One-Shot) | 80.9 | No | GLaM: Efficient Scaling of Language Models with ... | 2021-12-13 | - |
| 11 | PaLM 2-S (one-shot) | 80.7 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 12 | GLM-130B (bidirectional attention) | 80.2 | No | GLM-130B: An Open Bilingual Pre-trained Model | 2022-10-05 | Code |
| 13 | SparseGPT (175B, 2:4 Sparsity) | 79.47 | No | SparseGPT: Massive Language Models Can Be Accura... | 2023-01-02 | Code |
| 14 | SparseGPT (175B, 4:8 Sparsity) | 78.77 | No | SparseGPT: Massive Language Models Can Be Accura... | 2023-01-02 | Code |
| 15 | PaLM-540B (Zero-Shot) | 77.9 | No | PaLM: Scaling Language Modeling with Pathways | 2022-04-05 | Code |
| 16 | Chinchilla (Zero-Shot) | 77.7 | No | Training Compute-Optimal Large Language Models | 2022-03-29 | Code |
| 17 | SparseGPT (175B, 50% Sparsity) | 76.51 | No | SparseGPT: Massive Language Models Can Be Accura... | 2023-01-02 | Code |
| 18 | GPT-3 175B (Zero-Shot) | 76.2 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 19 | OPT-175B | 75.59 | No | SparseGPT: Massive Language Models Can Be Accura... | 2023-01-02 | Code |
| 20 | GPT-3 13B (Zero-Shot) | 72.5 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 21 | GLM-XXLarge (bidirectional) | 72.35 | No | GLM: General Language Model Pretraining with Aut... | 2021-03-18 | Code |
| 22 | Pythia 12B (0-shot) | 70.46 | No | Pythia: A Suite for Analyzing Large Language Mod... | 2023-04-03 | Code |
| 23 | GPT-3 6.7B (Zero-Shot) | 70.3 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 24 | GPT-J-6B | 69.7 | No | - | - | - |
| 25 | Mamba-2.8B | 69.2 | No | Mamba: Linear-Time Sequence Modeling with Select... | 2023-12-01 | Code |
| 26 | Pythia 6.9B (0-shot) | 67.28 | No | Pythia: A Suite for Analyzing Large Language Mod... | 2023-04-03 | Code |
| 27 | GLM-XXLarge (unidirectional) | 67.18 | No | GLM: General Language Model Pretraining with Aut... | 2021-03-18 | Code |
| 28 | GPT-3 2.7B (Zero-Shot) | 67.1 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 29 | GPT-2 1.5B (Zero Shot) | 63.24 | No | - | - | Code |
| 30 | Universal Transformer (w/ dynamic halting) | 56.25 | No | Universal Transformers | 2018-07-10 | Code |
| 31 | Residual Shuffle-Exchange network | 54.34 | No | Residual Shuffle-Exchange Networks for Fast Proc... | 2020-04-06 | Code |
| 32 | Gated-Attention Reader (+ features) | 49 | No | Broad Context Language Modeling as Reading Compr... | 2016-10-26 | - |
| 33 | OPT-175B (50% Sparsity) | 0.02 | No | SparseGPT: Massive Language Models Can Be Accura... | 2023-01-02 | Code |
| 34 | test | 0.01 | No | Test-Time Training with Self-Supervision for Gen... | 2019-09-29 | Code |