Metric: BLEU (higher is better)
| # | Model | BLEU | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | GPT-3 175B (Few-Shot) | 29.7 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 2 | MASS (6-layer Transformer) | 28.3 | No | MASS: Masked Sequence to Sequence Pre-training f... | 2019-05-07 | Code |
| 3 | SMT + NMT (tuning and joint refinement) | 26.9 | No | An Effective Approach to Unsupervised Machine Tr... | 2019-02-04 | Code |
| 4 | MLM pretraining for encoder and decoder | 26.4 | No | Cross-lingual Language Model Pretraining | 2019-01-22 | Code |
| 5 | SMT as posterior regularization | 21.7 | No | Unsupervised Neural Machine Translation with SMT... | 2019-01-14 | Code |
| 6 | PBSMT + NMT | 20.2 | No | Phrase-Based & Neural Unsupervised Machine Trans... | 2018-04-20 | Code |
| 7 | Synthetic bilingual data init | 20 | No | Unsupervised Neural Machine Translation Initiali... | 2018-10-30 | - |