Language Modelling on The Pile
Metric: Test perplexity (lower is better)
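For readers unfamiliar with the metric, here is a minimal sketch of how perplexity is commonly computed: the exponential of the average negative log-likelihood per token on the test set. The page does not say how each paper normalized its score (for example per token or per word), so this is an assumption and not a description of any listed paper's evaluation. The function name `perplexity` and the sample log-probabilities are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities.

    Assumption: per-token normalization; the listed papers may normalize differently.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Hypothetical example: a model that assigns probability 0.1 to every test token
# has a perplexity of about 10, comparable in scale to the best scores listed here.
sample_log_probs = [math.log(0.1)] * 4
print(perplexity(sample_log_probs))
```

Lower values mean the model assigned higher probability to the test text, which is why the leaderboard ranks lower perplexity first.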
Results
# | Model | Test perplexity | Extra data | Paper | Date | Code | SOTA badge
1 | Larger Transformer 771M (fine-tuned) | 10 | No | Need a Small Specialized Language Model? Plan Early! | 2024-02-02 | - | SOTA
2 | Hybrid H3 125M | 10.2 | No | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | 2022-12-28 | Code | SOTA
3 | GPT-Neo 2.7B | 10.44 | No | Knowledge Unlearning for Mitigating Privacy Risks in Language Models | 2022-10-04 | Code | SOTA
4 | Transformer 125M | 10.7 | No | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | 2022-12-28 | Code | -
5 | GPT-Neo 1.3B | 11.46 | No | Knowledge Unlearning for Mitigating Privacy Risks in Language Models | 2022-10-04 | Code | SOTA
6 | Smaller Transformer 126M (fine-tuned) | 12 | No | Need a Small Specialized Language Model? Plan Early! | 2024-02-02 | - | -
7 | OPT 2.7B | 17.81 | No | Knowledge Unlearning for Mitigating Privacy Risks in Language Models | 2022-10-04 | Code | SOTA
8 | GPT-Neo 125M | 17.83 | No | Knowledge Unlearning for Mitigating Privacy Risks in Language Models | 2022-10-04 | Code | SOTA
9 | OPT 1.3B | 19.55 | No | Knowledge Unlearning for Mitigating Privacy Risks in Language Models | 2022-10-04 | Code | SOTA
10 | Larger Transformer 771M (pre-trained) | 28.1 | No | Need a Small Specialized Language Model? Plan Early! | 2024-02-02 | - | -
11 | OPT 125M | 32.26 | No | Knowledge Unlearning for Mitigating Privacy Risks in Language Models | 2022-10-04 | Code | SOTA
12 | Smaller Transformer 126M (pre-trained) | 33 | No | Need a Small Specialized Language Model? Plan Early! | 2024-02-02 | - | -