Language Modelling on The Pile
Metric: Test perplexity (lower is better)
| # | Model | Test perplexity | Extra Data | Paper | Date | Code |
|---|-------|-----------------|------------|-------|------|------|
| 1 | Larger Transformer 771M (fine-tuned) | 10 | No | Need a Small Specialized Language Model? Plan Ea... | 2024-02-02 | - |
| 2 | Hybrid H3 125M | 10.2 | No | Hungry Hungry Hippos: Towards Language Modeling ... | 2022-12-28 | Code |
| 3 | GPT-Neo 2.7B | 10.44 | No | Knowledge Unlearning for Mitigating Privacy Risk... | 2022-10-04 | Code |
| 4 | Transformer 125M | 10.7 | No | Hungry Hungry Hippos: Towards Language Modeling ... | 2022-12-28 | Code |
| 5 | GPT-Neo 1.3B | 11.46 | No | Knowledge Unlearning for Mitigating Privacy Risk... | 2022-10-04 | Code |
| 6 | Smaller Transformer 126M (fine-tuned) | 12 | No | Need a Small Specialized Language Model? Plan Ea... | 2024-02-02 | - |
| 7 | OPT 2.7B | 17.81 | No | Knowledge Unlearning for Mitigating Privacy Risk... | 2022-10-04 | Code |
| 8 | GPT-Neo 125M | 17.83 | No | Knowledge Unlearning for Mitigating Privacy Risk... | 2022-10-04 | Code |
| 9 | OPT 1.3B | 19.55 | No | Knowledge Unlearning for Mitigating Privacy Risk... | 2022-10-04 | Code |
| 10 | Larger Transformer 771M (pre-trained) | 28.1 | No | Need a Small Specialized Language Model? Plan Ea... | 2024-02-02 | - |
| 11 | OPT 125M | 32.26 | No | Knowledge Unlearning for Mitigating Privacy Risk... | 2022-10-04 | Code |
| 12 | Smaller Transformer 126M (pre-trained) | 33 | No | Need a Small Specialized Language Model? Plan Ea... | 2024-02-02 | - |