Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Image Captioning on nocaps entire

Metric: ROUGE-L (higher is better)
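For readers unfamiliar with the metric, here is a minimal sketch of sentence-level ROUGE-L, which scores a candidate caption by the longest common subsequence (LCS) it shares with a reference. The function names, whitespace tokenization, single-reference setup, and `beta=1.2` weighting are assumptions made for illustration; the exact tokenization and multi-reference handling used to produce the scores on this page are not stated here.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                # Tokens match: extend the LCS ending before both tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # No match: carry over the best LCS found so far.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.2):
    """LCS-based F-measure in [0, 1] for one candidate and one reference.

    beta > 1 weights recall more than precision; 1.2 is an assumed default.
    """
    cand = candidate.lower().split()  # naive whitespace tokenization (assumption)
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    score = rouge_l("a dog runs on the beach", "a dog is running on a beach")
    # Leaderboard values appear to be on a 0-100 scale, so scale for comparison.
    print(f"ROUGE-L: {100 * score:.2f}")
```

A higher value means the candidate preserves more of the reference's word order, which is why the leaderboard treats higher as better.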


Results

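| # | Model | ROUGE-L | Extra Data | Paper | Date | Code |
|---|-------|---------|------------|-------|------|------|
| 1 | GIT, Single Model | 63.12 | No | GIT: A Generative Image-to-text Transformer for ... | 2022-05-27 | Code |
| 2 | CoCa - Google Brain | 62.52 | No | - | - | - |
| 3 | Microsoft Cognitive Services team | 61.2 | No | Scaling Up Vision-Language Pre-training for Imag... | 2021-11-24 | - |
| 4 | Prismer | 60.55 | No | Prismer: A Vision-Language Model with Multi-Task... | 2023-03-04 | Code |
| 5 | Single Model | 59.86 | No | SimVLM: Simple Visual Language Model Pretraining... | 2021-08-24 | Code |
| 6 | FudanFVL | 59.82 | No | - | - | - |
| 7 | FudanWYZ | 59.18 | No | - | - | - |
| 8 | IEDA-LAB | 58.56 | No | - | - | - |
| 9 | firethehole | 58.25 | No | - | - | - |
| 10 | MD | 57.57 | No | - | - | - |
| 11 | vll@mk514 | 57.4 | No | - | - | - |
| 12 | VinVL (Microsoft Cognitive Services + MSR) | 56.96 | No | VinVL: Revisiting Visual Representations in Visi... | 2021-01-02 | Code |
| 13 | ViTCAP-CIDEr-136.7-ENC-DEC-ViTbfocal10-test-CBS | 56.7 | No | - | - | - |
| 14 | icgp2ssi1_coco_si_0.02_5_test | 55.03 | No | - | - | - |
| 15 | evertyhing | 54.75 | No | - | - | - |
| 16 | camel XE | 54.3 | No | - | - | - |
| 17 | Oscar | 54.07 | No | - | - | - |
| 18 | RCAL | 53.85 | No | - | - | - |
| 19 | vinvl_yuan_cbs | 53.8 | No | - | - | - |
| 20 | Human | 52.83 | No | - | - | - |
| 21 | cxy_nocaps_training | 52.54 | No | - | - | - |
| 22 | MQ-UpDown-C | 52.53 | No | - | - | - |
| 23 | Xinyi | 52.35 | No | - | - | - |
| 24 | UpDown + ELMo + CBS | 51.82 | No | - | - | - |
| 25 | nocaps_training | 50.92 | No | - | - | - |
| 26 | UpDown | 50.92 | No | - | - | - |
| 27 | 7_10-7_40000_predict_test.json | 50.4 | No | - | - | - |
| 28 | B2 | 49.97 | No | - | - | - |
| 29 | None | 49.64 | No | - | - | - |
| 30 | YX | 49.38 | No | - | - | - |
| 31 | area_attention | 49.03 | No | - | - | - |
| 32 | Neural Baby Talk | 48.87 | No | - | - | - |
| 33 | Neural Baby Talk + CBS | 48.74 | No | - | - | - |
| 34 | coco_all_19 | 47.6 | No | - | - | - |
| 35 | Yu-Wu | 46.61 | No | - | - | - |
| 36 | CS395T | 46.58 | No | - | - | - |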