Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Image-to-Text Retrieval on Flickr30k

Metric: Recall@10 (higher is better)
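
For readers unfamiliar with the metric, here is a minimal sketch of how Recall@K is commonly computed for image-to-text retrieval: each image is a query, all captions are ranked by similarity, and the query counts as a hit if at least one of its ground-truth captions appears in the top K. The function name `recall_at_k`, the input layout, and the toy scores are assumptions made for illustration; individual papers on this leaderboard may differ in evaluation details such as test split or re-ranking.

```python
def recall_at_k(similarity, relevant, k=10):
    """Percentage of image queries with at least one relevant caption in the top k.

    similarity[i][j]: score between image i and caption j (higher = more similar).
    relevant[i]: set of caption indices that describe image i.
    Both structures are hypothetical inputs chosen for this sketch.
    """
    hits = 0
    for i, scores in enumerate(similarity):
        # Rank caption indices by descending score and keep the top k.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        top_k = set(ranked[:k])
        # A query is a hit if any ground-truth caption was retrieved.
        if relevant[i] & top_k:
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy data: 2 images, 4 captions, made-up similarity scores.
similarity = [
    [0.9, 0.1, 0.3, 0.2],
    [0.8, 0.2, 0.7, 0.1],
]
relevant = [{0, 1}, {2, 3}]

r1 = recall_at_k(similarity, relevant, k=1)
r2 = recall_at_k(similarity, relevant, k=2)
print(r1, r2)
```

With only a handful of ground-truth captions per image and a generous cutoff of 10, strong models can saturate this metric, which is consistent with the many scores of 100 in the results.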

Results

| # | Model | Recall@10 | Extra Data | Paper | Date | Code |
|---|-------|-----------|------------|-------|------|------|
| 1 | InternVL-G-FT (finetuned, w/o ranking) | 100 | No | InternVL: Scaling up Vision Foundation Models an... | 2023-12-21 | Code |
| 2 | BLIP-2 ViT-G (zero-shot, 1K test set) | 100 | No | BLIP-2: Bootstrapping Language-Image Pre-trainin... | 2023-01-30 | Code |
| 3 | ONE-PEACE (finetuned, w/o ranking) | 100 | No | ONE-PEACE: Exploring One General Representation ... | 2023-05-18 | Code |
| 4 | InternVL-C-FT (finetuned, w/o ranking) | 100 | No | InternVL: Scaling up Vision Foundation Models an... | 2023-12-21 | Code |
| 5 | BLIP-2 ViT-L (zero-shot, 1K test set) | 100 | No | BLIP-2: Bootstrapping Language-Image Pre-trainin... | 2023-01-30 | Code |
| 6 | ERNIE-ViL 2.0 | 100 | No | ERNIE-ViL 2.0: Multi-view Contrastive Learning f... | 2022-09-30 | Code |
| 7 | ALBEF | 100 | No | Align before Fuse: Vision and Language Represent... | 2021-07-16 | Code |
| 8 | ALBEF | 99.9 | No | HADA: A Graph-based Amalgamation Framework in Im... | 2023-01-11 | Code |
| 9 | UNITER | 99.2 | No | HADA: A Graph-based Amalgamation Framework in Im... | 2023-01-11 | Code |
| 10 | GSMN | 97.3 | No | A Deep Local and Global Scene-Graph Matching for... | 2021-06-04 | Code |
| 11 | LGSGM | 96.1 | No | A Deep Local and Global Scene-Graph Matching for... | 2021-06-04 | Code |