Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Image Captioning on COCO (Common Objects in Context)

Metric: CIDEr (higher is better)


Results

| # | Model | CIDEr | Extra Data | Paper | Date | Code |
|---|-------|-------|------------|-------|------|------|
| 1 | ExpansionNet v2 | 143.7 | No | Exploiting Multiple Sequence Lengths in Fast End... | 2022-08-13 | Code |
| 2 | M2 Transformer | 131.2 | No | Meshed-Memory Transformer for Image Captioning | 2019-12-17 | Code |
| 3 | IGINet | 131 | No | - | - | - |
| 4 | UNIMO-large | 127.7 | No | UNIMO: Towards Unified-Modal Understanding and G... | 2020-12-31 | Code |
| 5 | RDN | 125.2 | No | Reflective Decoding Network for Image Captioning | 2019-08-30 | - |
| 6 | Lyrics | 121.1 | No | Lyrics: Boosting Fine-grained Language-Vision Al... | 2023-12-08 | - |
| 7 | Bit Diffusion (20 steps) | 115 | No | Analog Bits: Generating Discrete Data using Diff... | 2022-08-08 | Code |
| 8 | Flamingo (80B; 4-shot) | 103 | No | Retrieval-Augmented Multimodal Language Modeling | 2022-11-22 | - |
| 9 | RA-CM3 (2.7B) | 89.1 | No | Retrieval-Augmented Multimodal Language Modeling | 2022-11-22 | - |
| 10 | Flamingo (3B; 4-shot) | 85 | No | Retrieval-Augmented Multimodal Language Modeling | 2022-11-22 | - |
| 11 | Perturb, Predict & Paraphrase | 84.5 | No | - | - | Code |
| 12 | Parti | 83.9 | No | Retrieval-Augmented Multimodal Language Modeling | 2022-11-22 | - |
| 13 | NIC (ResNet-50, CutMix) | 77.6 | No | CutMix: Regularization Strategy to Train Strong ... | 2019-05-13 | Code |
| 14 | Vanilla CM3 | 71.9 | No | Retrieval-Augmented Multimodal Language Modeling | 2022-11-22 | - |
| 15 | X-LXMERT | 55.8 | No | Retrieval-Augmented Multimodal Language Modeling | 2022-11-22 | - |
| 16 | minDALL-E | 48 | No | Retrieval-Augmented Multimodal Language Modeling | 2022-11-22 | - |
| 17 | ruDALL-E-XL | 38.7 | No | Retrieval-Augmented Multimodal Language Modeling | 2022-11-22 | - |
| 18 | DALL-E | 20.2 | No | Retrieval-Augmented Multimodal Language Modeling | 2022-11-22 | - |