Metric: Accuracy (higher is better)
| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | ByT5 XXL | 83.7 | No | ByT5: Towards a token-free future with pre-train... | 2021-05-28 | Code |
| 2 | Decoupled | 71.3 | No | Rethinking embedding coupling in pre-trained lan... | 2020-10-24 | Code |
| 3 | Coupled | 70.7 | No | Rethinking embedding coupling in pre-trained lan... | 2020-10-24 | Code |
| 4 | ByT5 Small | 69.1 | No | ByT5: Towards a token-free future with pre-train... | 2021-05-28 | Code |
| 5 | mGPT | 40.6 | No | mGPT: Few-Shot Learners Go Multilingual | 2022-04-15 | Code |