Test file provided by the author

Reported on 8 benchmarks across 1 task

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Natural Language Processing: 8 results

Image Captioning on nocaps in-domain

Metric     Score    Best (model)
B1         81.64    88.86 (GIT2, Single Model)
B2         63.79    76.1 (GIT, Single Model)
B3         43.43    60.53 (GIT, Single Model)
B4         25.15    41.65 (GIT, Single Model)
CIDEr      85.81    149.1 (PaLI)
METEOR     27.25    34.22 (PaLI)
ROUGE-L    55.06    64.39 (PaLI)
SPICE      12.35    16.36 (GIT2, Single Model)