Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Image Captioning on nocaps entire

Metric: SPICE (higher is better)
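For context on the metric: SPICE scores a candidate caption by parsing it and the reference captions into scene-graph tuples (objects, attributes, relations) and taking an F1 over the matched tuples. The sketch below shows only that final F1 step; the scene-graph parsing is omitted, and the example tuples are hypothetical, not from the official SPICE implementation.

```python
def spice_fscore(candidate_tuples, reference_tuples):
    """F1 between two sets of scene-graph tuples (objects/attributes/relations)."""
    cand = set(candidate_tuples)
    ref = set(reference_tuples)
    matched = len(cand & ref)  # SPICE proper also matches WordNet synonyms
    if matched == 0:
        return 0.0
    precision = matched / len(cand)
    recall = matched / len(ref)
    return 2 * precision * recall / (precision + recall)

# Hypothetical tuples for "a young girl riding a horse"
cand = [("girl",), ("girl", "young"), ("girl", "ride", "horse"), ("horse",)]
ref = [("girl",), ("girl", "ride", "horse"), ("horse",), ("horse", "brown")]
print(spice_fscore(cand, ref))  # → 0.75
```

A model's leaderboard score is this F1 averaged over all images in the split, scaled by 100 (so "15.94" is a mean F1 of about 0.159).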


Results

| # | Model | SPICE | Extra Data | Paper | Date | Code |
|---|-------|-------|------------|-------|------|------|
| 1 | GIT, Single Model | 15.94 | No | GIT: A Generative Image-to-text Transformer for ... | 2022-05-27 | Code |
| 2 | CoCa - Google Brain | 15.47 | No | - | - | - |
| 3 | Prismer | 14.91 | No | Prismer: A Vision-Language Model with Multi-Task... | 2023-03-04 | Code |
| 4 | Microsoft Cognitive Services team | 14.85 | No | Scaling Up Vision-Language Pre-training for Imag... | 2021-11-24 | - |
| 5 | firethehole | 14.74 | No | - | - | - |
| 6 | FudanFVL | 14.72 | No | - | - | - |
| 7 | Human | 14.67 | No | - | - | - |
| 8 | FudanWYZ | 14.56 | No | - | - | - |
| 9 | Single Model | 14.49 | No | SimVLM: Simple Visual Language Model Pretraining... | 2021-08-24 | Code |
| 10 | vll@mk514 | 14.06 | No | - | - | - |
| 11 | IEDA-LAB | 13.9 | No | - | - | - |
| 12 | MD | 13.35 | No | - | - | - |
| 13 | VinVL (Microsoft Cognitive Services + MSR) | 13.07 | No | VinVL: Revisiting Visual Representations in Visi... | 2021-01-02 | Code |
| 14 | ViTCAP-CIDEr-136.7-ENC-DEC-ViTbfocal10-test-CBS | 12.81 | No | - | - | - |
| 15 | RCAL | 12.2 | No | - | - | - |
| 16 | evertyhing | 12.1 | No | - | - | - |
| 17 | icgp2ssi1_coco_si_0.02_5_test | 12.01 | No | - | - | - |
| 18 | vinvl_yuan_cbs | 11.9 | No | - | - | - |
| 19 | camel XE | 11.89 | No | - | - | - |
| 20 | MQ-UpDown-C | 11.68 | No | - | - | - |
| 21 | Xinyi | 11.62 | No | - | - | - |
| 22 | cxy_nocaps_training | 11.57 | No | - | - | - |
| 23 | Oscar | 11.29 | No | - | - | - |
| 24 | UpDown + ELMo + CBS | 11.2 | No | - | - | - |
| 25 | ClipCap (MLP + GPT2 tuning) | 11.1 | No | ClipCap: CLIP Prefix for Image Captioning | 2021-11-18 | Code |
| 26 | 7_10-7_40000_predict_test.json | 10.96 | No | - | - | - |
| 27 | ClipCap (Transformer) | 10.86 | No | ClipCap: CLIP Prefix for Image Captioning | 2021-11-18 | Code |
| 28 | nocaps_training | 10.14 | No | - | - | - |
| 29 | UpDown | 10.14 | No | - | - | - |
| 30 | None | 10.1 | No | - | - | - |
| 31 | Neural Baby Talk + CBS | 9.69 | No | - | - | - |
| 32 | area_attention | 9.56 | No | - | - | - |
| 33 | YX | 9.54 | No | - | - | - |
| 34 | B2 | 9.42 | No | - | - | - |
| 35 | Neural Baby Talk | 9.15 | No | - | - | - |
| 36 | coco_all_19 | 9.13 | No | - | - | - |
| 37 | Yu-Wu | 8.35 | No | - | - | - |
| 38 | CS395T | 8.2 | No | - | - | - |