Image Retrieval on Flickr30K 1K test
Metric: R@5 (Recall@5, in percent; higher is better)
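As a rough illustration of how an R@5 score can be computed, here is a minimal sketch. The leaderboard does not describe the evaluation protocol, so this assumes the common text-to-image setup in which each caption query has exactly one ground-truth image and the score is reported as a percentage. The function name `recall_at_k` and the toy similarity scores are hypothetical and are not taken from any listed paper.

```python
def recall_at_k(similarity, gt_index, k=5):
    """Percentage of queries whose ground-truth item is among the k
    highest-scoring candidates.

    similarity: list of rows, one per query; row[j] is the score of candidate j.
    gt_index:   list giving the ground-truth candidate index for each query.
    """
    hits = 0
    for scores, gt in zip(similarity, gt_index):
        # Rank candidate indices from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if gt in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy data (hypothetical): 3 caption queries scored against 6 candidate images.
similarity = [
    [0.9, 0.1, 0.2, 0.3, 0.4, 0.5],  # ground truth 0 is ranked 1st
    [0.9, 0.1, 0.8, 0.7, 0.6, 0.5],  # ground truth 1 is ranked 6th
    [0.1, 0.2, 0.3, 0.9, 0.8, 0.7],  # ground truth 2 is ranked 4th
]
gt_index = [0, 1, 2]

r_at_5 = recall_at_k(similarity, gt_index, k=5)  # 2 of 3 queries hit
r_at_1 = recall_at_k(similarity, gt_index, k=1)  # 1 of 3 queries hit
print(f"R@1 = {r_at_1:.1f}, R@5 = {r_at_5:.1f}")
```

On a real benchmark the similarity rows would come from a trained model. Sorting every row is fine for a sketch, but a partial top-k selection would be cheaper for large candidate sets.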
Results
# | Model | R@5 | Extra Data | Paper | Date | Code
--- | --- | --- | --- | --- | --- | ---
1 | X-VLM (base) | 97.3 | Yes | Multi-Grained Vision Language Pre-Training: Alig... | 2021-11-16 | Code
2 | RCAR | 85.8 | No | Plug-and-Play Regulators for Image-Text Matching | 2023-03-23 | Code
3 | LGSGM | 84.1 | No | A Deep Local and Global Scene-Graph Matching for... | 2021-06-04 | Code
4 | TERAN Symm. | 83.1 | No | Fine-grained Visual Textual Alignment for Cross-... | 2020-08-12 | Code
5 | SGRAF | 83 | No | Similarity Reasoning and Filtration for Image-Te... | 2021-01-05 | Code
6 | VisualSparta | 82 | No | VisualSparta: An Embarrassingly Simple Approach ... | 2021-01-01 | Code
7 | VSRN | 81.8 | No | Visual Semantic Reasoning for Image-Text Matching | 2019-09-06 | Code
8 | TERAN MrSw | 81.2 | No | Fine-grained Visual Textual Alignment for Cross-... | 2020-08-12 | Code
9 | CAMP | 77.1 | No | CAMP: Cross-Modal Adaptive Message Passing for T... | 2019-09-12 | Code
10 | SCAN i-t | 74.2 | No | Stacked Cross Attention for Image-Text Matching | 2018-03-21 | Code
11 | SCO | 70.5 | No | Learning Semantic Concepts and Order for Image a... | 2017-12-06 | -
12 | DAN | 69.2 | No | Dual Attention Networks for Multimodal Reasoning... | 2016-11-02 | Code
13 | SPE | 60.1 | No | Learning Deep Structure-Preserving Image-Text Em... | 2015-11-19 | -
14 | mCNN | 56.3 | No | Multimodal Convolutional Neural Networks for Mat... | 2015-04-23 | Code
15 | HGLMM FV | 53.4 | No | Flickr30k Entities: Collecting Region-to-Phrase ... | 2015-05-19 | Code