Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Image Captioning on nocaps entire

Metric: B2 (BLEU-2; higher is better)

Results

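| # | Model | B2 | Extra Data | Paper | Date | Code |
|---|-------|----|------------|-------|------|------|
| 1 | GIT, Single Model | 74.81 | No | GIT: A Generative Image-to-text Transformer for ... | 2022-05-27 | Code |
| 2 | CoCa - Google Brain | 73.71 | No | - | - | - |
| 3 | Microsoft Cognitive Services team | 71.36 | No | Scaling Up Vision-Language Pre-training for Imag... | 2021-11-24 | - |
| 4 | Prismer | 69.99 | No | Prismer: A Vision-Language Model with Multi-Task... | 2023-03-04 | Code |
| 5 | Single Model | 68.86 | No | SimVLM: Simple Visual Language Model Pretraining... | 2021-08-24 | Code |
| 6 | FudanFVL | 68.77 | No | - | - | - |
| 7 | FudanWYZ | 67.45 | No | - | - | - |
| 8 | IEDA-LAB | 67.3 | No | - | - | - |
| 9 | MD | 66.25 | No | - | - | - |
| 10 | firethehole | 65.55 | No | - | - | - |
| 11 | VinVL (Microsoft Cognitive Services + MSR) | 65.15 | No | VinVL: Revisiting Visual Representations in Visi... | 2021-01-02 | Code |
| 12 | vll@mk514 | 65.1 | No | - | - | - |
| 13 | ViTCAP-CIDEr-136.7-ENC-DEC-ViTbfocal10-test-CBS | 64.62 | No | - | - | - |
| 14 | icgp2ssi1_coco_si_0.02_5_test | 61.95 | No | - | - | - |
| 15 | evertyhing | 61.6 | No | - | - | - |
| 16 | vinvl_yuan_cbs | 60.95 | No | - | - | - |
| 17 | Oscar | 60.83 | No | - | - | - |
| 18 | RCAL | 60.74 | No | - | - | - |
| 19 | camel XE | 60.27 | No | - | - | - |
| 20 | cxy_nocaps_training | 59.36 | No | - | - | - |
| 21 | Xinyi | 59.05 | No | - | - | - |
| 22 | MQ-UpDown-C | 57.76 | No | - | - | - |
| 23 | UpDown + ELMo + CBS | 56.74 | No | - | - | - |
| 24 | Human | 56.46 | No | - | - | - |
| 25 | nocaps_training | 55.11 | No | - | - | - |
| 26 | UpDown | 55.11 | No | - | - | - |
| 27 | B2 | 54.08 | No | - | - | - |
| 28 | 7_10-7_40000_predict_test.json | 52.88 | No | - | - | - |
| 29 | YX | 52.52 | No | - | - | - |
| 30 | Neural Baby Talk | 52.42 | No | - | - | - |
| 31 | Neural Baby Talk + CBS | 52.12 | No | - | - | - |
| 32 | None | 52.04 | No | - | - | - |
| 33 | area_attention | 51.97 | No | - | - | - |
| 34 | coco_all_19 | 48.95 | No | - | - | - |
| 35 | CS395T | 47.65 | No | - | - | - |
| 36 | Yu-Wu | 47.37 | No | - | - | - |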