Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Image Captioning on nocaps entire

Metric: B1 (higher is better)
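
For readers unfamiliar with the metric, the sketch below illustrates how a BLEU-1 style score can be computed for one caption, assuming that B1 denotes BLEU-1 (clipped unigram precision multiplied by a brevity penalty), as is conventional for captioning benchmarks. The function name `bleu1`, the whitespace tokenization, and the 0-100 scaling are illustrative assumptions; the benchmark's official scorer may tokenize differently and aggregates over the whole test set, so this is not expected to reproduce the leaderboard numbers.

```python
import math
from collections import Counter


def bleu1(candidate, references):
    """Sentence-level BLEU-1 sketch (hypothetical helper, not the official scorer)."""
    # Assumption: lowercase + whitespace tokenization.
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand:
        return 0.0

    # Clip each candidate word count by its maximum count in any single reference.
    cand_counts = Counter(cand)
    max_ref_counts = Counter()
    for ref in refs:
        for word, count in Counter(ref).items():
            max_ref_counts[word] = max(max_ref_counts[word], count)
    clipped = sum(min(count, max_ref_counts[word]) for word, count in cand_counts.items())
    precision = clipped / len(cand)

    # Brevity penalty: compare against the reference length closest to the
    # candidate length (ties go to the shorter reference).
    ref_len = min((len(r) for r in refs), key=lambda n: (abs(n - len(cand)), n))
    bp = 1.0 if len(cand) >= ref_len else math.exp(1 - ref_len / len(cand))

    # Scaled to 0-100 to match how scores are displayed on the leaderboard.
    return 100.0 * bp * precision


if __name__ == "__main__":
    refs = ["a dog is running on the grass", "the dog runs in a field"]
    print(bleu1("a dog runs on the grass", refs))
```

Clipping keeps a caption from scoring well by repeating a common word, and the brevity penalty keeps very short captions from scoring well on precision alone.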

Results

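| # | Model | B1 | Extra Data | Paper | Date | Code |
|---|-------|----|------------|-------|------|------|
| 1 | GIT, Single Model | 88.1 | No | GIT: A Generative Image-to-text Transformer for ... | 2022-05-27 | Code |
| 2 | CoCa - Google Brain | 87.01 | No | - | - | - |
| 3 | Microsoft Cognitive Services team | 85.62 | No | Scaling Up Vision-Language Pre-training for Imag... | 2021-11-24 | - |
| 4 | Prismer | 84.87 | No | Prismer: A Vision-Language Model with Multi-Task... | 2023-03-04 | Code |
| 5 | FudanFVL | 83.9 | No | - | - | - |
| 6 | Single Model | 83.78 | No | SimVLM: Simple Visual Language Model Pretraining... | 2021-08-24 | Code |
| 7 | IEDA-LAB | 83.25 | No | - | - | - |
| 8 | FudanWYZ | 82.95 | No | - | - | - |
| 9 | MD | 82.43 | No | - | - | - |
| 10 | vll@mk514 | 81.61 | No | - | - | - |
| 11 | VinVL (Microsoft Cognitive Services + MSR) | 81.59 | No | VinVL: Revisiting Visual Representations in Visi... | 2021-01-02 | Code |
| 12 | ViTCAP-CIDEr-136.7-ENC-DEC-ViTbfocal10-test-CBS | 81.03 | No | - | - | - |
| 13 | firethehole | 80.77 | No | - | - | - |
| 14 | Oscar | 79.57 | No | - | - | - |
| 15 | vinvl_yuan_cbs | 79.32 | No | - | - | - |
| 16 | icgp2ssi1_coco_si_0.02_5_test | 79 | No | - | - | - |
| 17 | evertyhing | 78.92 | No | - | - | - |
| 18 | cxy_nocaps_training | 78.75 | No | - | - | - |
| 19 | Xinyi | 78.58 | No | - | - | - |
| 20 | RCAL | 78.19 | No | - | - | - |
| 21 | camel XE | 77.97 | No | - | - | - |
| 22 | MQ-UpDown-C | 76.89 | No | - | - | - |
| 23 | Human | 76.64 | No | - | - | - |
| 24 | UpDown + ELMo + CBS | 76.59 | No | - | - | - |
| 25 | nocaps_training | 74 | No | - | - | - |
| 26 | UpDown | 74 | No | - | - | - |
| 27 | Neural Baby Talk + CBS | 73.42 | No | - | - | - |
| 28 | B2 | 73.04 | No | - | - | - |
| 29 | YX | 72.78 | No | - | - | - |
| 30 | 7_10-7_40000_predict_test.json | 72.49 | No | - | - | - |
| 31 | Neural Baby Talk | 72.33 | No | - | - | - |
| 32 | area_attention | 72.02 | No | - | - | - |
| 33 | None | 71.69 | No | - | - | - |
| 34 | coco_all_19 | 69.44 | No | - | - | - |
| 35 | CS395T | 69.07 | No | - | - | - |
| 36 | Yu-Wu | 67.85 | No | - | - | - |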