Language Modelling on One Billion Word

Metric: Validation perplexity (lower is better)
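
For readers unfamiliar with the metric, the following is a minimal sketch of how perplexity is usually computed: the exponential of the average negative log-likelihood per token. The function name `perplexity` and the input format (a list of natural-log token probabilities) are illustrative assumptions, not taken from any leaderboard entry. Tokenization and normalization conventions vary between papers, so this is not necessarily how the listed results were evaluated.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the mean negative log-likelihood.

    token_log_probs: natural-log probabilities the model assigned to each
    target token (hypothetical input format, chosen for illustration).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# A model that assigns probability 1/4 to every token is as uncertain as a
# uniform choice among four options, so its perplexity is about 4.
uniform_log_probs = [math.log(0.25)] * 8
print(perplexity(uniform_log_probs))
```

Lower values mean the model assigns higher probability to the held-out text, which is why the table ranks smaller numbers first.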

Results

#	Model	Validation perplexity	Extra Data	Paper	Date	Code
1	H-Transformer-1D Nr=16 (Large)	20.25	No	H-Transformer-1D: Fast One-Dimensional Hierarchi...	2021-07-25	Code
2	Adaptive Input Very Large	22.92	No	Adaptive Input Representations for Neural Langua...	2018-09-28	Code
3	Adaptive Input Large	23.83	No	Adaptive Input Representations for Neural Langua...	2018-09-28	Code
4	H-Transformer-1D Nr=16 (Base)	23.95	No	H-Transformer-1D: Fast One-Dimensional Hierarchi...	2021-07-25	Code