Language Modelling on Hutter Prize
Metric: Bits per Character (BPC) (lower is better)
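For readers unfamiliar with the metric, the sketch below shows one common way BPC is computed: as the average negative base-2 log-probability a model assigns to each character, which is why smaller values indicate a better model. The function names `bits_per_character` and `nats_to_bpc` are hypothetical helpers written for illustration; they are not taken from any of the listed papers or their code.

```python
import math


def bits_per_character(char_probs):
    """Average negative log2 probability over the characters of a text.

    char_probs: the probability the model assigned to each character
    that actually occurred (hypothetical input format).
    """
    if not char_probs:
        raise ValueError("need at least one probability")
    return -sum(math.log2(p) for p in char_probs) / len(char_probs)


def nats_to_bpc(mean_nll_nats):
    """Convert a mean per-character cross-entropy in nats to bits."""
    return mean_nll_nats / math.log(2)


# A model that guesses uniformly over 256 symbols needs 8 bits per character.
uniform = bits_per_character([1 / 256] * 4)

# A model that assigns probability 0.5 to every observed character needs 1 bit.
confident = bits_per_character([0.5] * 4)
```

Under this definition, a model reporting a mean per-character loss in nats (as most training frameworks do) can be compared with the scores in this leaderboard after dividing by ln 2.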
| # | Model | Bits per Character (BPC) | Extra Data | Paper | Date |
|---|---|---|---|---|---|
| 1 (SOTA) | Transformer-XL + RMS dynamic eval | 0.94 | No | Dynamic Evaluation of Transformer Language Models | 2019-04-17 |
| 2 | Compressive Transformer | 0.97 | No | Compressive Transformers for Long-Range Sequence Modelling | 2019-11-13 |
| 3 | Mogrifier LSTM + dynamic eval | 0.988 | No | Mogrifier LSTM | 2019-09-04 |
| 4 | Longformer Large | 0.99 | No | Longformer: The Long-Document Transformer | 2020-04-10 |
| 5 | 24-layer Transformer-XL | 0.99 | No | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | 2019-01-09 |
| 6 | Longformer Small | 1.00 | No | Longformer: The Long-Document Transformer | 2020-04-10 |
| 7 | 18-layer Transformer-XL | 1.03 | Yes | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | 2019-01-09 |
| 8 | 12-layer Transformer-XL | 1.06 | Yes | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | 2019-01-09 |
| 9 | 64-layer Character Transformer Model | 1.06 | No | Character-Level Language Modeling with Deeper Self-Attention | 2018-08-09 |
| 10 | mLSTM + dynamic eval | 1.08 | No | Dynamic Evaluation of Neural Sequence Models | 2017-09-21 |
| 11 | 12-layer Character Transformer Model | 1.11 | No | Character-Level Language Modeling with Deeper Self-Attention | 2018-08-09 |
| 12 | Mogrifier LSTM | 1.122 | No | Mogrifier LSTM | 2019-09-04 |
| 13 | 3-layer AWD-LSTM | 1.232 | No | An Analysis of Neural Language Modeling at Multiple Scales | 2018-03-22 |
| 14 | Large mLSTM +emb +WN +VD | 1.24 | No | Multiplicative LSTM for sequence modelling | 2016-09-26 |
| 15 | Large FS-LSTM-4 | 1.245 | No | Fast-Slow Recurrent Neural Networks | 2017-05-24 |
| 16 | Large RHN | 1.27 | No | Recurrent Highway Networks | 2016-07-12 |
| 17 | FS-LSTM-4 | 1.277 | No | Fast-Slow Recurrent Neural Networks | 2017-05-24 |
| 18 | RHN - depth 5 [zilly2016recurrent] | 1.31 | No | Recurrent Highway Networks | 2016-07-12 |