Language Modelling on Penn Treebank (Character Level)
Metric: Bits per Character (BPC) (lower is better)
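The page does not spell out how BPC is computed. Under the standard definition, it is the average negative base-2 log-probability that a model assigns to each true character of the test text, so a lower value means a better model. The sketch below illustrates that definition only; the function names `bits_per_character` and `nats_to_bpc` are hypothetical and are not taken from any of the listed papers or their code.

```python
import math


def bits_per_character(char_probs):
    """Average negative log2-probability assigned to each true character.

    char_probs: probabilities (each in (0, 1]) that a model gave to the
    character that actually occurred at each position. Illustrative input,
    not the output format of any specific model on this leaderboard.
    """
    if not char_probs:
        raise ValueError("need at least one character probability")
    return -sum(math.log2(p) for p in char_probs) / len(char_probs)


def nats_to_bpc(mean_cross_entropy_nats):
    """Convert a mean per-character cross-entropy in nats to bits."""
    return mean_cross_entropy_nats / math.log(2)


# A model that gives probability 0.5 to every true character scores 1 BPC.
example_bpc = bits_per_character([0.5, 0.5, 0.5, 0.5])
```

Many training frameworks report cross-entropy loss in nats, so a conversion like `nats_to_bpc` is typically needed before comparing a loss value against the scores listed here.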
Results
| # | Model | BPC (lower is better) | Extra Data | Paper | Date | Code |
|---|-------|-----------------------|------------|-------|------|------|
| 1 | Mogrifier LSTM + dynamic eval | 1.083 | No | Mogrifier LSTM | 2019-09-04 | Yes |
| 2 | Mogrifier LSTM | 1.12 | No | Mogrifier LSTM | 2019-09-04 | Yes |
| 3 | GAM-RHN-5 | 1.147 | No | - | - | Yes |
| 4 | Trellis Network | 1.158 | No | Trellis Networks for Sequence Modeling | 2018-10-15 | Yes |
| 5 | Feedback Transformer | 1.16 | No | Addressing Some Limitations of Transformers with Feedback Memory | 2020-02-21 | Yes |
| 6 | Past Decode Reg. + AWD-LSTM-MoS + dyn. eval. | 1.169 | No | Improved Language Modeling by Decoding the Past | 2018-08-14 | - |
| 7 | 3-layer AWD-LSTM | 1.175 | No | An Analysis of Neural Language Modeling at Multiple Scales | 2018-03-22 | Yes |
| 8 | Dense IndRNN | 1.18 | No | Deep Independently Recurrent Neural Network (IndRNN) | 2019-10-11 | Yes |
| 9 | 6-layer QRNN | 1.187 | No | An Analysis of Neural Language Modeling at Multiple Scales | 2018-03-22 | Yes |
| 10 | FS-LSTM-4 | 1.19 | No | Fast-Slow Recurrent Neural Networks | 2017-05-24 | Yes |
| 11 | IndRNN | 1.19 | No | Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN | 2018-03-13 | Yes |
| 12 | FS-LSTM-2 | 1.193 | No | Fast-Slow Recurrent Neural Networks | 2017-05-24 | Yes |
| 13 | NAS-RL | 1.214 | No | Neural Architecture Search with Reinforcement Learning | 2016-11-05 | Yes |
| 14 | 2-layer Norm HyperLSTM | 1.219 | No | HyperNetworks | 2016-09-27 | Yes |
| 15 | R-Transformer | 1.24 | No | R-Transformer: Recurrent Neural Network Enhanced Transformer | 2019-07-12 | Yes |
| 16 | Seq-U-Net | 1.3 | No | Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling | 2019-11-14 | Yes |
| 17 | STAR | 1.3 | No | Gating Revisited: Deep Multi-layer RNNs That Can Be Trained | 2019-11-25 | Yes |
| 18 | TCN | 1.31 | No | Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling | 2019-11-14 | Yes |
| 19 | Temporal Convolutional Network | 1.31 | No | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling | 2018-03-04 | Yes |
| 20 | Bipartite Flow | 1.38 | No | Discrete Flows: Invertible Generative Models of Discrete Data | 2019-05-24 | Yes |