Image Retrieval on PhotoChat

Metric: R@10 (Recall@10; higher is better)
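For readers unfamiliar with the metric, the following is a minimal sketch of how a recall-at-k score is commonly computed for retrieval. It assumes one ground-truth image per query; the exact evaluation protocol behind this leaderboard is not stated on the page, and the function name `recall_at_k` and the toy image IDs are hypothetical.

```python
def recall_at_k(rankings, gold_ids, k=10):
    """Fraction of queries whose gold item appears among the top-k candidates.

    rankings: list of candidate-ID lists, each sorted best-first (one per query).
    gold_ids: list with the single correct ID for each query (an assumption).
    """
    if len(rankings) != len(gold_ids):
        raise ValueError("rankings and gold_ids must have the same length")
    if not rankings:
        raise ValueError("at least one query is required")
    # A query counts as a hit when its gold ID is within the first k candidates.
    hits = sum(1 for ranking, gold in zip(rankings, gold_ids) if gold in ranking[:k])
    return hits / len(rankings)


# Toy data: three queries, each with a ranked list of three candidate images.
rankings = [
    ["img3", "img1", "img7"],
    ["img2", "img9", "img4"],
    ["img5", "img6", "img8"],
]
gold_ids = ["img1", "img4", "img0"]

r_at_2 = recall_at_k(rankings, gold_ids, k=2)
r_at_3 = recall_at_k(rankings, gold_ids, k=3)
print(f"R@2 = {100 * r_at_2:.1f}, R@3 = {100 * r_at_3:.1f}")
```

Scores on leaderboards like this one are usually shown as percentages, which is why the example multiplies by 100 before printing.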


Results

#	Model	R@10	Extra Data	SOTA badge	Paper	Date	Code
1	PaCE	49.6	No	SOTA	PaCE: Unified Multi-modal Dialogue Pre-training with Progressive and Compositional Experts	2023-05-24	Code
2	VLMo	39.4	No	SOTA	VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts	2021-11-03	Code
3	SCAN	37.1	No	SOTA	Stacked Cross Attention for Image-Text Matching	2018-03-21	Code
4	DE++	35.7	No	-	PhotoChat: A Human-Human Dialogue Dataset with Photo Sharing Behavior for Joint Image-Text Modeling	2021-07-06	-
5	ViLT	25.6	No	-	ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision	2021-02-05	Code