Phrase Grounding on Flickr30k Entities Test
Metric: R@1 (higher is better)
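As a rough illustration of the metric, R@1 for phrase grounding is commonly computed as the percentage of phrases whose top-ranked predicted box overlaps the ground-truth box with an intersection-over-union (IoU) of at least 0.5. The sketch below assumes exactly one ground-truth box per phrase and boxes given as (x1, y1, x2, y2); the function names `iou` and `recall_at_1` are hypothetical, and individual papers on this leaderboard may follow slightly different evaluation protocols.

```python
from typing import Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(
    top_boxes: Sequence[Box],
    gt_boxes: Sequence[Box],
    threshold: float = 0.5,  # assumed IoU threshold
) -> float:
    """Percentage of phrases whose top-1 box reaches the IoU threshold."""
    if len(top_boxes) != len(gt_boxes):
        raise ValueError("need one top-1 prediction per ground-truth phrase")
    if not gt_boxes:
        return 0.0
    hits = sum(iou(p, g) >= threshold for p, g in zip(top_boxes, gt_boxes))
    return 100.0 * hits / len(gt_boxes)
```

Called with one top-ranked box per phrase, `recall_at_1` returns a percentage on the same 0 to 100 scale as the scores listed on this page.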
Results
| # | Model | R@1 | Extra Data | Paper | Date | Code |
|---|-------|-----|------------|-------|------|------|
| 1 | GLIPv2 | 87.7 | Yes | GLIPv2: Unifying Localization and Vision-Languag... | 2022-06-12 | Code |
| 2 | FIBER-B | 87.4 | Yes | Coarse-to-Fine Vision-Language Pre-training with... | 2022-06-15 | Code |
| 3 | GLIP | 87.1 | Yes | Grounded Language-Image Pre-training | 2021-12-07 | Code |
| 4 | PEVL | 84.4 | Yes | PEVL: Position-enhanced Pre-training and Prompt ... | 2022-05-23 | Code |
| 5 | MDETR-ENB5 | 84.3 | Yes | MDETR -- Modulated Detection for End-to-End Mult... | 2021-04-26 | Code |
| 6 | DIGN | 78.73 | No | Disentangled Motif-aware Graph Learning for Phra... | 2021-04-13 | - |
| 7 | LCMCG | 76.74 | No | Learning Cross-modal Context Graph for Visual Gr... | 2019-11-20 | Code |
| 8 | Soft-Label Chain CRF (SL-CCRF) | 74.69 | No | Phrase Grounding by Soft-Label Chain Conditional... | 2019-09-01 | Code |
| 9 | DDPN (ResNet-101) | 73.3 | No | Rethinking Diversified and Discriminative Propos... | 2018-05-09 | Code |
| 10 | VisualBERT | 71.33 | No | VisualBERT: A Simple and Performant Baseline for... | 2019-08-09 | Code |
| 11 | BAN (Bottom-Up detector) | 69.69 | No | Bilinear Attention Networks | 2018-05-21 | Code |
| 12 | MCB | 48.69 | No | Multimodal Compact Bilinear Pooling for Visual Q... | 2016-06-06 | Code |
| 13 | GroundeR 100.0% annot. | 48.38 | No | Grounding of Textual Phrases in Images by Recons... | 2015-11-12 | Code |
| 14 | DSPE | 43.89 | No | Learning Deep Structure-Preserving Image-Text Em... | 2015-11-19 | - |
| 15 | CCA - Fast RCNN | 41.77 | No | Flickr30k Entities: Collecting Region-to-Phrase ... | 2015-05-19 | Code |
| 16 | CCA - VGG19 | 30.83 | No | Flickr30k Entities: Collecting Region-to-Phrase ... | 2015-05-19 | Code |
| 17 | SCRC | 27.8 | No | Natural Language Object Retrieval | 2015-11-13 | Code |
| 18 | CCA | 25.3 | No | Flickr30k Entities: Collecting Region-to-Phrase ... | 2015-05-19 | Code |