Metric: Exact Match (EM); higher is better. Rows are sorted by EM in descending order; a minimal scoring sketch follows the table.
| # | Model | EM | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | ByT5 (fine-tuned) | 81.9 | No | ByT5: Towards a token-free future with pre-trained byte-to-byte models | 2021-05-28 | Code |
| 2 | U-PaLM 62B (fine-tuned) | 78.4 | No | Transcending Scaling Laws with 0.1% Extra Compute | 2022-10-20 | - |
| 3 | Flan-U-PaLM 540B (direct-prompting) | 68.3 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 4 | Flan-PaLM 540B (direct-prompting) | 67.8 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 5 | ByT5 XXL | 60.0 | No | ByT5: Towards a token-free future with pre-trained byte-to-byte models | 2021-05-28 | Code |
| 6 | U-PaLM 540B (CoT) | 54.6 | No | Transcending Scaling Laws with 0.1% Extra Compute | 2022-10-20 | - |
| 7 | PaLM 540B (CoT) | 52.9 | No | PaLM: Scaling Language Modeling with Pathways | 2022-04-05 | Code |
| 8 | Decoupled | 42.8 | No | Rethinking embedding coupling in pre-trained language models | 2020-10-24 | Code |
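
For reference, EM scores a prediction 1 only if it is identical to a gold answer after normalization, and the table reports the corpus-level mean as a percentage. Below is a minimal sketch assuming SQuAD-style normalization (lowercasing, stripping punctuation and the articles a/an/the); the exact normalization scheme used by this leaderboard is an assumption, not taken from the source.

```python
import re
import string

PUNCT = set(string.punctuation)

def normalize(text: str) -> str:
    # SQuAD-style normalization (an assumption here): lowercase, drop
    # punctuation and the articles a/an/the, collapse whitespace.
    text = "".join(ch for ch in text.lower() if ch not in PUNCT)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, references: list[str]) -> float:
    # 1.0 if the normalized prediction equals any normalized reference.
    return float(any(normalize(prediction) == normalize(ref) for ref in references))

# Corpus-level EM is the mean per-example score, reported as a percentage.
predictions = ["Paris", "the answer is 4"]
references = [["Paris"], ["4"]]
em = 100.0 * sum(exact_match(p, r) for p, r in zip(predictions, references)) / len(predictions)
print(f"EM = {em:.1f}")  # EM = 50.0
```

Note that EM is all-or-nothing per example: "the answer is 4" scores 0 against the gold answer "4" because normalization strips articles but not other extra words, which is why EM is typically reported alongside softer metrics such as token-level F1.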