Metric: BLEU (higher is better)
| # | Model | BLEU | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | GPT-3 175B (Few-Shot) | 40.6 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 2 | MASS (6-layer Transformer) | 35.2 | No | MASS: Masked Sequence to Sequence Pre-training f... | 2019-05-07 | Code |
| 3 | SMT + NMT (tuning and joint refinement) | 34.4 | No | An Effective Approach to Unsupervised Machine Tr... | 2019-02-04 | Code |
| 4 | MLM pretraining for encoder and decoder | 34.3 | No | Cross-lingual Language Model Pretraining | 2019-01-22 | Code |
| 5 | Synthetic bilingual data init | 26.7 | No | Unsupervised Neural Machine Translation Initiali... | 2018-10-30 | - |
| 6 | SMT as posterior regularization | 26.3 | No | Unsupervised Neural Machine Translation with SMT... | 2019-01-14 | Code |
| 7 | PBSMT | 25.2 | No | Phrase-Based & Neural Unsupervised Machine Trans... | 2018-04-20 | Code |