Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Image Captioning on nocaps entire

Metric: B4 (BLEU-4; higher is better)
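B4 here is BLEU-4: the geometric mean of the modified 1- through 4-gram precisions of a candidate caption against its reference captions, multiplied by a brevity penalty. A minimal self-contained sketch of that computation (the `bleu4` function name and the tokenized inputs are illustrative, not part of any leaderboard API; production evaluations use a standard implementation such as the COCO caption toolkit):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu4(candidate, references):
    """Sentence-level BLEU-4: geometric mean of clipped 1-4-gram
    precisions times a brevity penalty against the closest reference."""
    precisions = []
    for n in range(1, 5):
        cand_counts = Counter(ngrams(candidate, n))
        # Clip each n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for g, c in Counter(ngrams(ref, n)).items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(clipped / total)
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / 4)
    # Brevity penalty uses the reference length closest to the candidate's.
    ref_len = min((len(r) for r in references),
                  key=lambda L: (abs(L - len(candidate)), L))
    bp = 1.0 if len(candidate) >= ref_len else math.exp(1 - ref_len / len(candidate))
    return bp * geo_mean
```

A perfect match scores 1.0, and any caption shorter than four tokens (or missing all 4-grams) scores 0.0, which is why sentence-level BLEU-4 is usually reported corpus-wide or with smoothing.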

Results

| # | Model | B4 | Extra Data | Paper | Date | Code |
|---:|---|---:|---|---|---|---|
| 1 | CoCa - Google Brain | 37.71 | No | - | - | - |
| 2 | GIT, Single Model | 37.35 | No | GIT: A Generative Image-to-text Transformer for ... | 2022-05-27 | Code |
| 3 | Microsoft Cognitive Services team | 34.65 | No | Scaling Up Vision-Language Pre-training for Imag... | 2021-11-24 | - |
| 4 | Prismer | 33.66 | No | Prismer: A Vision-Language Model with Multi-Task... | 2023-03-04 | Code |
| 5 | Single Model | 32.2 | No | SimVLM: Simple Visual Language Model Pretraining... | 2021-08-24 | Code |
| 6 | FudanFVL | 32.17 | No | - | - | - |
| 7 | FudanWYZ | 31.38 | No | - | - | - |
| 8 | firethehole | 30.2 | No | - | - | - |
| 9 | IEDA-LAB | 29.27 | No | - | - | - |
| 10 | MD | 28.2 | No | - | - | - |
| 11 | vll@mk514 | 27.32 | No | - | - | - |
| 12 | ViTCAP-CIDEr-136.7-ENC-DEC-ViTbfocal10-test-CBS | 26.52 | No | - | - | - |
| 13 | VinVL (Microsoft Cognitive Services + MSR) | 26.15 | No | VinVL: Revisiting Visual Representations in Visi... | 2021-01-02 | Code |
| 14 | icgp2ssi1_coco_si_0.02_5_test | 24.62 | No | - | - | - |
| 15 | evertyhing | 23.52 | No | - | - | - |
| 16 | camel XE | 23.48 | No | - | - | - |
| 17 | RCAL | 21.24 | No | - | - | - |
| 18 | Oscar | 21.02 | No | - | - | - |
| 19 | vinvl_yuan_cbs | 20.3 | No | - | - | - |
| 20 | MQ-UpDown-C | 20.11 | No | - | - | - |
| 21 | cxy_nocaps_training | 19.72 | No | - | - | - |
| 22 | Human | 19.48 | No | - | - | - |
| 23 | Xinyi | 19.43 | No | - | - | - |
| 24 | nocaps_training | 19.16 | No | - | - | - |
| 25 | UpDown | 19.16 | No | - | - | - |
| 26 | UpDown + ELMo + CBS | 18.41 | No | - | - | - |
| 27 | 7_10-7_40000_predict_test.json | 17.75 | No | - | - | - |
| 28 | B2 | 17.69 | No | - | - | - |
| 29 | None | 16.73 | No | - | - | - |
| 30 | area_attention | 16.48 | No | - | - | - |
| 31 | YX | 16.31 | No | - | - | - |
| 32 | coco_all_19 | 15.02 | No | - | - | - |
| 33 | Neural Baby Talk | 14.73 | No | - | - | - |
| 34 | Neural Baby Talk + CBS | 12.88 | No | - | - | - |
| 35 | Yu-Wu | 11.96 | No | - | - | - |
| 36 | CS395T | 11.72 | No | - | - | - |