Image Captioning on nocaps near-domain
Metric: CIDEr (higher is better)
Results
| # | Model | CIDEr | Extra Data | Paper | Date | Code |
|---|-------|-------|------------|-------|------|------|
| 1 | GIT2, Single Model | 125.51 | No | GIT: A Generative Image-to-text Transformer for Vision and Language | 2022-05-27 | Code |
| 2 | PaLI | 124.35 | No | PaLI: A Jointly-Scaled Multilingual Language-Image Model | 2022-09-14 | Code |
| 3 | GIT, Single Model | 123.92 | No | GIT: A Generative Image-to-text Transformer for Vision and Language | 2022-05-27 | Code |
| 4 | CoCa - Google Brain | 120.73 | No | - | - | - |
| 5 | Microsoft Cognitive Services team | 115.54 | No | VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning | 2020-09-28 | - |
| 6 | Single Model | 110.76 | No | SimVLM: Simple Visual Language Model Pretraining with Weak Supervision | 2021-08-24 | Code |
| 7 | FudanFVL | 109.33 | No | - | - | - |
| 8 | FudanWYZ | 108.04 | No | - | - | - |
| 9 | IEDA-LAB | 100.15 | No | - | - | - |
| 10 | firethehole | 99.51 | No | - | - | - |
| 11 | MD | 95.73 | No | - | - | - |
| 12 | vll@mk514 | 95.69 | No | - | - | - |
| 13 | VinVL (Microsoft Cognitive Services + MSR) | 95.16 | No | VinVL: Revisiting Visual Representations in Vision-Language Models | 2021-01-02 | Code |
| 14 | ViTCAP-CIDEr-136.7-ENC-DEC-ViTbfocal10-test-CBS | 89.87 | No | - | - | - |
| 15 | icgp2ssi1_coco_si_0.02_5_test | 87.41 | No | - | - | - |
| 16 | evertyhing | 85.89 | No | - | - | - |
| 17 | Human | 84.58 | No | - | - | - |
| 18 | RCAL | 84 | No | - | - | - |
| 19 | Oscar | 82.07 | No | - | - | - |
| 20 | vinvl_yuan_cbs | 80.21 | No | - | - | - |
| 21 | cxy_nocaps_training | 79.72 | No | - | - | - |
| 22 | Xinyi | 79.44 | No | - | - | - |
| 23 | camel XE | 79.14 | No | - | - | - |
| 24 | MQ-UpDown-C | 76.34 | No | - | - | - |
| 25 | UpDown + ELMo + CBS | 74.2 | No | - | - | - |
| 26 | ClipCap (MLP + GPT2 tuning) | 67.69 | No | ClipCap: CLIP Prefix for Image Captioning | 2021-11-18 | Code |
| 27 | ClipCap (Transformer) | 66.82 | No | ClipCap: CLIP Prefix for Image Captioning | 2021-11-18 | Code |
| 28 | 7_10-7_40000_predict_test.json | 63.96 | No | - | - | - |
| 29 | Neural Baby Talk + CBS | 61.98 | No | - | - | - |
| 30 | None | 58.5 | No | - | - | - |
| 31 | nocaps_training | 56.85 | No | - | - | - |
| 32 | UpDown | 56.85 | No | - | - | - |
| 33 | Neural Baby Talk | 53.21 | No | - | - | - |
| 34 | YX | 51.16 | No | - | - | - |
| 35 | area_attention | 50.34 | No | - | - | - |
| 36 | B2 | 49.62 | No | - | - | - |
| 37 | coco_all_19 | 47.53 | No | - | - | - |
| 38 | Yu-Wu | 46.64 | No | - | - | - |
| 39 | CS395T | 40.45 | No | - | - | - |