Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Image Retrieval on Flickr30K 1K test

Metric: R@10 (higher is better)
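
The page does not spell out how R@10 is computed. The following is a minimal sketch, assuming the usual Recall@K definition for text-to-image retrieval: the percentage of caption queries whose ground-truth image appears among the K highest-scoring images. The function name `recall_at_k`, its parameters, and the toy similarity matrix are hypothetical and are not taken from any listed paper's code.

```python
from typing import Sequence


def recall_at_k(
    similarity: Sequence[Sequence[float]],
    gt_index: Sequence[int],
    k: int = 10,
) -> float:
    """Return the percentage of queries whose ground-truth image is in the top k.

    similarity[q][i] is the (assumed) similarity score between query q and image i.
    gt_index[q] is the index of the correct image for query q.
    """
    hits = 0
    for scores, gt in zip(similarity, gt_index):
        # Rank image indices by descending similarity score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if gt in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy data: 3 caption queries scored against 3 images (hypothetical values).
sim = [
    [0.9, 0.1, 0.3],  # query 0, correct image is 0
    [0.2, 0.8, 0.5],  # query 1, correct image is 1
    [0.7, 0.6, 0.1],  # query 2, correct image is 2 (ranked last)
]
gt = [0, 1, 2]

r_at_1 = recall_at_k(sim, gt, k=1)
r_at_3 = recall_at_k(sim, gt, k=3)
```

On the toy data, two of the three queries rank their correct image first, so `r_at_1` is about 66.7, while `r_at_3` is 100.0 because every image falls within the top 3. Individual papers may differ in evaluation details, so reported scores should be read against each paper's own protocol.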

Results

| # | Model | R@10 | Extra Data | Paper | Date | Code |
|---|-------|------|------------|-------|------|------|
| 1 | X-VLM (base) | 98.7 | Yes | Multi-Grained Vision Language Pre-Training: Alig... | 2021-11-16 | Code |
| 2 | RCAR | 91.1 | No | Plug-and-Play Regulators for Image-Text Matching | 2023-03-23 | Code |
| 3 | LGSGM | 90.2 | No | A Deep Local and Global Scene-Graph Matching for... | 2021-06-04 | Code |
| 4 | TERAN Symm. | 89.3 | No | Fine-grained Visual Textual Alignment for Cross-... | 2020-08-12 | Code |
| 5 | SGRAF | 88.8 | No | Similarity Reasoning and Filtration for Image-Te... | 2021-01-05 | Code |
| 6 | TERAN MrSw | 88.2 | No | Fine-grained Visual Textual Alignment for Cross-... | 2020-08-12 | Code |
| 7 | VSRN | 88.2 | No | Visual Semantic Reasoning for Image-Text Matching | 2019-09-06 | Code |
| 8 | VisualSparta | 88.1 | No | VisualSparta: An Embarrassingly Simple Approach ... | 2021-01-01 | Code |
| 9 | CAMP | 85.3 | No | CAMP: Cross-Modal Adaptive Message Passing for T... | 2019-09-12 | Code |
| 10 | SCAN i-t | 82.6 | No | Stacked Cross Attention for Image-Text Matching | 2018-03-21 | Code |
| 11 | SCO | 80.1 | No | Learning Semantic Concepts and Order for Image a... | 2017-12-06 | - |
| 12 | DAN | 79.1 | No | Dual Attention Networks for Multimodal Reasoning... | 2016-11-02 | Code |
| 13 | SM-LSTM (VGG) | 72.3 | No | Instance-aware Image and Sentence Matching with ... | 2016-11-17 | - |
| 14 | SPE | 72.1 | No | Learning Deep Structure-Preserving Image-Text Em... | 2015-11-19 | - |
| 15 | mCNN | 69.6 | No | Multimodal Convolutional Neural Networks for Mat... | 2015-04-23 | Code |
| 16 | HGLMM FV | 66.8 | No | Flickr30k Entities: Collecting Region-to-Phrase ... | 2015-05-19 | Code |
| 17 | DVSA (R-CNN, AlexNet) | 50.5 | No | Deep Visual-Semantic Alignments for Generating I... | 2014-12-07 | Code |