Metric: METEOR (higher is better)
| # | Model | METEOR | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | GIT, Single Model | 32.5 | No | GIT: A Generative Image-to-text Transformer for ... | 2022-05-27 | Code |
| 2 | CoCa - Google Brain | 32.29 | No | - | - | - |
| 3 | Microsoft Cognitive Services team | 31.27 | No | Scaling Up Vision-Language Pre-training for Imag... | 2021-11-24 | - |
| 4 | Prismer | 31.13 | No | Prismer: A Vision-Language Model with Multi-Task... | 2023-03-04 | Code |
| 5 | FudanFVL | 30.64 | No | - | - | - |
| 6 | Single Model | 30.55 | No | SimVLM: Simple Visual Language Model Pretraining... | 2021-08-24 | Code |
| 7 | FudanWYZ | 30.32 | No | - | - | - |
| 8 | firethehole | 30.07 | No | - | - | - |
| 9 | IEDA-LAB | 28.92 | No | - | - | - |
| 10 | vll@mk514 | 28.46 | No | - | - | - |
| 11 | Human | 28.15 | No | - | - | - |
| 12 | MD | 28.09 | No | - | - | - |
| 13 | VinVL (Microsoft Cognitive Services + MSR) | 27.57 | No | VinVL: Revisiting Visual Representations in Visi... | 2021-01-02 | Code |
| 14 | ViTCAP-CIDEr-136.7-ENC-DEC-ViTbfocal10-test-CBS | 27.36 | No | - | - | - |
| 15 | evertyhing | 26.31 | No | - | - | - |
| 16 | icgp2ssi1_coco_si_0.02_5_test | 26.29 | No | - | - | - |
| 17 | camel XE | 26.15 | No | - | - | - |
| 18 | RCAL | 25.72 | No | - | - | - |
| 19 | vinvl_yuan_cbs | 25.44 | No | - | - | - |
| 20 | Oscar | 25.33 | No | - | - | - |
| 21 | MQ-UpDown-C | 25.18 | No | - | - | - |
| 22 | cxy_nocaps_training | 25.13 | No | - | - | - |
| 23 | Xinyi | 25.12 | No | - | - | - |
| 24 | UpDown + ELMo + CBS | 24.42 | No | - | - | - |
| 25 | 7_10-7_40000_predict_test.json | 23.89 | No | - | - | - |
| 26 | nocaps_training | 22.96 | No | - | - | - |
| 27 | UpDown | 22.96 | No | - | - | - |
| 28 | None | 22.53 | No | - | - | - |
| 29 | Neural Baby Talk + CBS | 22.06 | No | - | - | - |
| 30 | area_attention | 21.87 | No | - | - | - |
| 31 | B2 | 21.85 | No | - | - | - |
| 32 | YX | 21.72 | No | - | - | - |
| 33 | Neural Baby Talk | 21.52 | No | - | - | - |
| 34 | coco_all_19 | 20.77 | No | - | - | - |
| 35 | Yu-Wu | 19.84 | No | - | - | - |
| 36 | CS395T | 19.61 | No | - | - | - |