Metric: BLEU (EN-DE) (higher is better)
| # | Model | BLEU (EN-DE) | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | ERNIE-UniX2 | 49.3 | No | ERNIE-UniX2: A Unified Cross-lingual Cross-modal... | 2022-11-09 | - |
| 2 | IKD-MMT | 41.28 | No | Distill the Image to Nowhere: Inversion Knowledg... | 2022-10-10 | Code |
| 3 | DCCN | 39.7 | No | Dynamic Context-guided Capsule Network for Multi... | 2020-09-04 | Code |
| 4 | Caglayan | 39.4 | No | Multimodal Machine Translation through Visuals a... | 2019-11-28 | - |
| 5 | Gumbel-Attention MMT | 39.2 | No | Gumbel-Attention for Multi-modal Machine Transla... | 2021-03-16 | - |
| 6 | Multimodal Transformer | 38.7 | No | - | - | Code |
| 7 | ImagiT | 38.4 | No | Generative Imagination Elevates Machine Translat... | 2020-09-21 | - |
| 8 | del+obj | 38 | No | Distilling Translations with Visual Awareness | 2019-06-18 | Code |
| 9 | VMMTF | 37.6 | No | Latent Variable Model for Multi-modal Translation | 2018-11-01 | Code |
| 10 | IMGD | 37.3 | No | Incorporating Global Visual Features into Attent... | 2017-01-23 | - |
| 11 | NMTSRC+IMG | 37.1 | No | Doubly-Attentive Decoder for Multi-modal Neural ... | 2017-02-04 | - |
| 12 | VAG-NMT | 31.6 | No | A Visual Attention Grounding Neural Model for Mu... | 2018-08-24 | Code |