Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Image Captioning on nocaps entire

Metric: B3 (higher is better)


Results

| # | Model | B3 | Extra Data | Paper | Date | Code |
|---|-------|---:|------------|-------|------|------|
| 1 | GIT, Single Model | 57.68 | No | GIT: A Generative Image-to-text Transformer for ... | 2022-05-27 | Code |
| 2 | CoCa - Google Brain | 56.88 | No | - | - | - |
| 3 | Microsoft Cognitive Services team | 53.62 | No | Scaling Up Vision-Language Pre-training for Imag... | 2021-11-24 | - |
| 4 | Prismer | 52.48 | No | Prismer: A Vision-Language Model with Multi-Task... | 2023-03-04 | Code |
| 5 | Single Model | 51.06 | No | SimVLM: Simple Visual Language Model Pretraining... | 2021-08-24 | Code |
| 6 | FudanFVL | 50.84 | No | - | - | - |
| 7 | FudanWYZ | 49.58 | No | - | - | - |
| 8 | IEDA-LAB | 48.41 | No | - | - | - |
| 9 | firethehole | 48.14 | No | - | - | - |
| 10 | MD | 47.18 | No | - | - | - |
| 11 | vll@mk514 | 46.13 | No | - | - | - |
| 12 | ViTCAP-CIDEr-136.7-ENC-DEC-ViTbfocal10-test-CBS | 45.26 | No | - | - | - |
| 13 | VinVL (Microsoft Cognitive Services + MSR) | 45.04 | No | VinVL: Revisiting Visual Representations in Visi... | 2021-01-02 | Code |
| 14 | icgp2ssi1_coco_si_0.02_5_test | 42.36 | No | - | - | - |
| 15 | evertyhing | 41.52 | No | - | - | - |
| 16 | camel XE | 40.68 | No | - | - | - |
| 17 | vinvl_yuan_cbs | 39.5 | No | - | - | - |
| 18 | RCAL | 39.11 | No | - | - | - |
| 19 | Oscar | 38.83 | No | - | - | - |
| 20 | cxy_nocaps_training | 37.56 | No | - | - | - |
| 21 | Xinyi | 37.39 | No | - | - | - |
| 22 | MQ-UpDown-C | 36.93 | No | - | - | - |
| 23 | Human | 36.37 | No | - | - | - |
| 24 | UpDown + ELMo + CBS | 35.39 | No | - | - | - |
| 25 | nocaps_training | 35.23 | No | - | - | - |
| 26 | UpDown | 35.23 | No | - | - | - |
| 27 | B2 | 33.88 | No | - | - | - |
| 28 | 7_10-7_40000_predict_test.json | 33.22 | No | - | - | - |
| 29 | YX | 31.74 | No | - | - | - |
| 30 | None | 31.7 | No | - | - | - |
| 31 | area_attention | 31.62 | No | - | - | - |
| 32 | Neural Baby Talk | 30.83 | No | - | - | - |
| 33 | Neural Baby Talk + CBS | 29.35 | No | - | - | - |
| 34 | coco_all_19 | 28.64 | No | - | - | - |
| 35 | Yu-Wu | 25.76 | No | - | - | - |
| 36 | CS395T | 25.5 | No | - | - | - |
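The page does not define the B3 metric; it is presumably BLEU-3, i.e. the geometric mean of modified 1- through 3-gram precisions against the reference captions, scaled by a brevity penalty. A minimal pure-Python sketch under that assumption (the function name and tokenization are illustrative, not the evaluation server's implementation):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, references, max_n=3):
    """Sentence-level BLEU with uniform weights up to max_n.

    Assumes B3 == BLEU-3; candidate and references are token lists.
    """
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        # Clip each n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in Counter(ngrams(ref, n)).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(clipped / total)
    if min(precisions) == 0:
        return 0.0
    # Brevity penalty against the closest reference length.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An exact match scores 1.0, and any candidate with no matching trigram scores 0.0, which is why leaderboard values (reported here as percentages) cluster well below 100.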