Metric: Validation perplexity (lower is better)
| # | Model | Validation perplexity | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | H-Transformer-1D Nr=16 (Large) | 20.25 | No | H-Transformer-1D: Fast One-Dimensional Hierarchi... | 2021-07-25 | Code |
| 2 | Adaptive Input Very Large | 22.92 | No | Adaptive Input Representations for Neural Langua... | 2018-09-28 | Code |
| 3 | Adaptive Input Large | 23.83 | No | Adaptive Input Representations for Neural Langua... | 2018-09-28 | Code |
| 4 | H-Transformer-1D Nr=16 (Base) | 23.95 | No | H-Transformer-1D: Fast One-Dimensional Hierarchi... | 2021-07-25 | Code |