Language Modelling on Hutter Prize
Metric: Bits per Character (BPC) (lower is better)
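As a rough illustration of the metric, BPC is the average negative base-2 log-probability a model assigns to each true character of the test text. The sketch below is not taken from any of the listed papers; the function names and the example probabilities are hypothetical, and it assumes the per-character probabilities (or a mean cross-entropy in nats) are already available from a model.

```python
import math


def bits_per_character(char_probs):
    """Average negative log2-probability assigned to each true character.

    `char_probs` is a hypothetical list: one probability per character,
    namely the probability the model gave to the character that occurred.
    """
    if not char_probs:
        raise ValueError("need at least one character probability")
    return -sum(math.log2(p) for p in char_probs) / len(char_probs)


def nats_to_bpc(mean_cross_entropy_nats):
    """Convert a mean per-character cross-entropy in nats to bits per character."""
    return mean_cross_entropy_nats / math.log(2)


# Made-up probabilities for four consecutive characters (illustration only).
example_probs = [0.5, 0.25, 0.5, 0.25]
print(bits_per_character(example_probs))  # 1.5
print(nats_to_bpc(math.log(2)))           # 1.0
```

Training code commonly reports cross-entropy in nats, so the second helper is the usual conversion step when comparing a training loss against BPC figures such as those in the table.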
Results
Sorted by BPC, descending.

| # | Model | Bits per Character (BPC) | Extra Data | Paper | Date |
|---|-------|--------------------------|------------|-------|------|
| 1 | RHN - depth 5 [zilly2016recurrent] | 1.31 | No | Recurrent Highway Networks | 2016-07-12 |
| 2 | FS-LSTM-4 | 1.277 | No | Fast-Slow Recurrent Neural Networks | 2017-05-24 |
| 3 | Large RHN | 1.27 | No | Recurrent Highway Networks | 2016-07-12 |
| 4 | Large FS-LSTM-4 | 1.245 | No | Fast-Slow Recurrent Neural Networks | 2017-05-24 |
| 5 | Large mLSTM +emb +WN +VD | 1.24 | No | Multiplicative LSTM for sequence modelling | 2016-09-26 |
| 6 | 3-layer AWD-LSTM | 1.232 | No | An Analysis of Neural Language Modeling at Multi... | 2018-03-22 |
| 7 | Mogrifier LSTM | 1.122 | No | Mogrifier LSTM | 2019-09-04 |
| 8 | 12-layer Character Transformer Model | 1.11 | No | Character-Level Language Modeling with Deeper Se... | 2018-08-09 |
| 9 | mLSTM + dynamic eval | 1.08 | No | Dynamic Evaluation of Neural Sequence Models | 2017-09-21 |
| 10 | 64-layer Character Transformer Model | 1.06 | No | Character-Level Language Modeling with Deeper Se... | 2018-08-09 |
| 11 | 12-layer Transformer-XL | 1.06 | Yes | Transformer-XL: Attentive Language Models Beyond... | 2019-01-09 |
| 12 | 18-layer Transformer-XL | 1.03 | Yes | Transformer-XL: Attentive Language Models Beyond... | 2019-01-09 |
| 13 | Longformer Small | 1 | No | Longformer: The Long-Document Transformer | 2020-04-10 |
| 14 | 24-layer Transformer-XL | 0.99 | No | Transformer-XL: Attentive Language Models Beyond... | 2019-01-09 |
| 15 | Longformer Large | 0.99 | No | Longformer: The Long-Document Transformer | 2020-04-10 |
| 16 | Mogrifier LSTM + dynamic eval | 0.988 | No | Mogrifier LSTM | 2019-09-04 |
| 17 | Compressive Transformer | 0.97 | No | Compressive Transformers for Long-Range Sequence... | 2019-11-13 |
| 18 | Transformer-XL + RMS dynamic eval | 0.94 | No | Dynamic Evaluation of Transformer Language Models | 2019-04-17 |