Image Captioning on nocaps near-domain
Metric: SPICE (higher is better)
Results
| # | Model | SPICE | Extra Data | Paper | Date | Code |
|---|-------|-------|------------|-------|------|------|
| 1 | GIT2, Single Model | 16.11 | No | GIT: A Generative Image-to-text Transformer for Vision and Language | 2022-05-27 | Code |
| 2 | GIT, Single Model | 15.96 | No | GIT: A Generative Image-to-text Transformer for Vision and Language | 2022-05-27 | Code |
| 3 | PaLI | 15.75 | No | PaLI: A Jointly-Scaled Multilingual Language-Image Model | 2022-09-14 | Code |
| 4 | PaLI | 15.75 | No | PaLI: A Jointly-Scaled Multilingual Language-Image Model | 2022-09-14 | Code |
| 5 | CoCa - Google Brain | 15.54 | No | - | - | - |
| 6 | Microsoft Cognitive Services team | 15.06 | No | VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning | 2020-09-28 | - |
| 7 | firethehole | 14.88 | No | - | - | - |
| 8 | FudanFVL | 14.79 | No | - | - | - |
| 9 | Human | 14.72 | No | - | - | - |
| 10 | FudanWYZ | 14.71 | No | - | - | - |
| 11 | Single Model | 14.61 | No | SimVLM: Simple Visual Language Model Pretraining with Weak Supervision | 2021-08-24 | Code |
| 12 | vll@mk514 | 14.37 | No | - | - | - |
| 13 | IEDA-LAB | 14.15 | No | - | - | - |
| 14 | MD | 13.64 | No | - | - | - |
| 15 | VinVL (Microsoft Cognitive Services + MSR) | 13.36 | No | VinVL: Revisiting Visual Representations in Vision-Language Models | 2021-01-02 | Code |
| 16 | ViTCAP-CIDEr-136.7-ENC-DEC-ViTbfocal10-test-CBS | 12.98 | No | - | - | - |
| 17 | RCAL | 12.47 | No | - | - | - |
| 18 | evertyhing | 12.24 | No | - | - | - |
| 19 | camel XE | 12.14 | No | - | - | - |
| 20 | vinvl_yuan_cbs | 12.12 | No | - | - | - |
| 21 | icgp2ssi1_coco_si_0.02_5_test | 12.11 | No | - | - | - |
| 22 | Xinyi | 11.88 | No | - | - | - |
| 23 | MQ-UpDown-C | 11.87 | No | - | - | - |
| 24 | cxy_nocaps_training | 11.81 | No | - | - | - |
| 25 | Oscar | 11.53 | No | - | - | - |
| 26 | UpDown + ELMo + CBS | 11.45 | No | - | - | - |
| 27 | ClipCap (MLP + GPT2 tuning) | 11.26 | No | ClipCap: CLIP Prefix for Image Captioning | 2021-11-18 | Code |
| 28 | 7_10-7_40000_predict_test.json | 11.14 | No | - | - | - |
| 29 | ClipCap (Transformer) | 10.92 | No | ClipCap: CLIP Prefix for Image Captioning | 2021-11-18 | Code |
| 30 | nocaps_training | 10.33 | No | - | - | - |
| 31 | UpDown | 10.33 | No | - | - | - |
| 32 | None | 10.28 | No | - | - | - |
| 33 | Neural Baby Talk + CBS | 9.83 | No | - | - | - |
| 34 | YX | 9.7 | No | - | - | - |
| 35 | area_attention | 9.7 | No | - | - | - |
| 36 | B2 | 9.54 | No | - | - | - |
| 37 | coco_all_19 | 9.28 | No | - | - | - |
| 38 | Neural Baby Talk | 9.26 | No | - | - | - |
| 39 | Yu-Wu | 8.37 | No | - | - | - |
| 40 | CS395T | 8.28 | No | - | - | - |