Image Retrieval on Fashion IQ
Metric: (Recall@10+Recall@50)/2 (higher is better)
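The metric above can be illustrated with a small sketch. It assumes each query has exactly one correct target image and that a model returns a ranked list of gallery IDs per query. The function names `recall_at_k` and `leaderboard_metric` and the toy IDs are hypothetical. This page does not state evaluation details such as the split or any per-category averaging, so the sketch leaves them out.

```python
from typing import Sequence


def recall_at_k(rankings: Sequence[Sequence[str]],
                targets: Sequence[str],
                k: int) -> float:
    """Percentage of queries whose target appears among the top-k results."""
    if len(rankings) != len(targets):
        raise ValueError("rankings and targets must have the same length")
    if not targets:
        raise ValueError("at least one query is required")
    hits = sum(target in ranked[:k] for ranked, target in zip(rankings, targets))
    return 100.0 * hits / len(targets)


def leaderboard_metric(rankings: Sequence[Sequence[str]],
                       targets: Sequence[str]) -> float:
    """Mean of Recall@10 and Recall@50, i.e. (Recall@10+Recall@50)/2."""
    return (recall_at_k(rankings, targets, 10)
            + recall_at_k(rankings, targets, 50)) / 2


# Toy data (hypothetical): four queries that all return the same ranking.
gallery = [f"img{i}" for i in range(60)]
rankings = [gallery] * 4
# Target positions in the ranking: 0, 20, 55 and 9 (zero-based).
targets = ["img0", "img20", "img55", "img9"]

r10 = recall_at_k(rankings, targets, 10)      # 2 of 4 targets in the top 10
r50 = recall_at_k(rankings, targets, 50)      # 3 of 4 targets in the top 50
score = leaderboard_metric(rankings, targets)
print(r10, r50, score)                        # 50.0 75.0 62.5
```

Averaging the two cutoffs rewards a model both for placing the target very high (top 10) and for keeping it within a looser shortlist (top 50).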
Results
| # | Model | (Recall@10+Recall@50)/2 | SOTA badge | Extra Data | Date | Paper | Code |
|---|---|---|---|---|---|---|---|
| 1 | DQU-CIR | 71.77 | | No | - | - | - |
| 2 | TMCIR | 66.56 | SOTA | No | 2025-04-15 | TMCIR: Token Merge Benefits Composed Image Retrieval | - |
| 3 | SPN4CIR (SPRC) | 66.41 | SOTA | No | 2024-04-17 | Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives | Code |
| 4 | SPRC | 64.85 | SOTA | Yes | 2023-10-09 | Sentence-level Prompts Benefit Composed Image Retrieval | Code |
| 5 | Candidate Set Re-ranking | 62.15 | SOTA | No | 2023-05-25 | Candidate Set Re-ranking for Composed Image Retrieval with Dual Multi-modal Encoder | Code |
| 6 | RUTIR (BLIP B/16) | 61.32 | | No | 2023-08-16 | Ranking-aware Uncertainty for Text-guided Image Retrieval | - |
| 7 | CoVR-BLIP-2 | 60.57 | | No | 2023-08-28 | CoVR-2: Automatic Data Construction for Composed Video Retrieval | Code |
| 8 | CASE | 59.73 | SOTA | No | 2023-03-16 | Data Roaming and Quality Assessment for Composed Image Retrieval | Code |
| 9 | CaLa | 57.96 | | No | 2024-05-29 | CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval | Code |
| 10 | RTD + LinCIR (CLIP G/14) | 56.74 | | No | 2024-06-13 | An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval | Code |
| 11 | BLIP4CIR+Bi | 55.4 | | No | 2023-03-29 | Bi-directional Training for Composed Image Retrieval via Text Prompt Learning | Code |
| 12 | LinCIR (CLIP G/14) | 55.4 | | No | 2023-12-04 | Language-only Efficient Training of Zero-shot Composed Image Retrieval | Code |
| 13 | CLIP4Cir (v3) | 55.36 | | No | 2023-08-22 | Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features | Code |
| 14 | RUTIR (CLIP ResNet50) | 55.27 | | No | 2023-08-16 | Ranking-aware Uncertainty for Text-guided Image Retrieval | - |
| 15 | SEIZE (CLIP G/14) | 54.45 | | No | - | - | Code |
| 16 | Css-Net | 51.34 | | No | 2023-06-03 | Collaborative Group: Composed Image Retrieval via Consensus Learning from Noisy Annotations | - |
| 17 | MUR (4*ResNet50) | 50.61 | SOTA | No | 2022-11-14 | Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization | Code |
| 18 | CLIP4Cir (v2) | 50.03 | | No | - | - | Code |
| 19 | CoLLM (finetuned - BLIP-L/16) | 49.9 | | No | 2025-03-25 | CoLLM: A Large Language Model for Composed Image Retrieval | Code |
| 20 | SCOT (WACV 2025) | 49.24 | | No | 2025-01-12 | SCOT: Self-Supervised Contrastive Pretraining For Zero-Shot Compositional Retrieval | - |
| 21 | CoVR-BLIP-2 | 48.3 | | No | 2023-08-28 | CoVR-2: Automatic Data Construction for Composed Video Retrieval | Code |
| 22 | MagicLens (CoCa L) | 48.1 | | No | 2024-03-28 | MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | Code |
| 23 | OSrCIR (CLIP G/14) | 47.34 | | No | 2024-12-15 | Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval | Code |
| 24 | MUR | 47.28 | | No | 2022-11-14 | Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization | Code |
| 25 | CLIP4Cir | 47.21 | | No | - | - | Code |
| 26 | WeiMoCIR (CLIP G/14) | 47.16 | | No | 2024-09-07 | Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | Code |
| 27 | MTCIR (CLIP L/14) | 46.42 | | No | 2023-11-13 | Pretrain like Your Inference: Masked Tuning Improves Zero-Shot Composed Image Retrieval | Code |
| 28 | MMRet-MLLM | 46.1 | | No | 2024-12-19 | MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval | Code |
| 29 | CompoDiff (CLIP G/14) | 45.37 | | No | 2023-03-21 | CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion | Code |
| 30 | CoLLM (Pretrained - BLIP-L/16) | 45.3 | | No | 2025-03-25 | CoLLM: A Large Language Model for Composed Image Retrieval | Code |
| 31 | MagicLens (CoCa B) | 45.3 | | No | 2024-03-28 | MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | Code |
| 32 | TransAgg (Laion-CIR-Combined) | 44.75 | | No | 2023-06-12 | Zero-shot Composed Text-Image Retrieval | Code |
| 33 | WeiMoCIR (CLIP H/14) | 44.58 | | No | 2024-09-07 | Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | Code |
| 34 | CompoDiff (CLIP L/14) | 44.11 | | No | 2023-03-21 | CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion | Code |
| 35 | LDRE (CLIP G/14) | 43.98 | | No | - | - | Code |
| 36 | OSrCIR (CLIP B/32) | 42.87 | | No | 2024-12-15 | Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval | Code |
| 37 | OSrCIR (CLIP L/14) | 42.82 | | No | 2024-12-15 | Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval | Code |
| 38 | CIReVL (CLIP G/14) | 42.28 | | No | 2023-10-13 | Vision-by-Language for Training-Free Compositional Image Retrieval | Code |
| 39 | MagicLens (CLIP L) | 41.6 | | No | 2024-03-28 | MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | Code |
| 40 | WeiMoCIR (CLIP L/14) | 41.27 | | No | 2024-09-07 | Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | Code |
| 41 | RTD + LinCIR (CLIP L/14) | 40.66 | | No | 2024-06-13 | An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval | Code |
| 42 | RTIC-GCN | 40.64 | SOTA | No | 2021-04-07 | RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network | Code |
| 43 | WeiMoCIR (CLIP B/32) | 39.84 | | No | 2024-09-07 | Training-free Zero-shot Composed Image Retrieval via Weighted Modality Fusion and Similarity | Code |
| 44 | CoLLM (Pretrained - CLIP-L/14) | 39.8 | | No | 2025-03-25 | CoLLM: A Large Language Model for Composed Image Retrieval | Code |
| 45 | CoSMo | 39.45 | | No | - | - | Code |
| 46 | iSEARLE-XL-OTI (CLIP L/14) | 39.39 | | No | 2024-05-05 | iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | Code |
| 47 | CIReVL (CLIP B/32) | 38.82 | | No | 2023-10-13 | Vision-by-Language for Training-Free Compositional Image Retrieval | Code |
| 48 | CIReVL (CLIP L/14) | 38.56 | | No | 2023-10-13 | Vision-by-Language for Training-Free Compositional Image Retrieval | Code |
| 49 | CurlingNet | 38.45 | SOTA | No | 2020-03-27 | CurlingNet: Compositional Learning between Images and Text for Fashion IQ Data | Code |
| 50 | Context-I2W (CLIP L/14) | 38.35 | | No | 2023-09-28 | Context-I2W: Mapping Images to Context-dependent Words for Accurate Zero-Shot Composed Image Retrieval | Code |
| 51 | iSEARLE-XL (CLIP L/14) | 38.24 | | No | 2024-05-05 | iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | Code |
| 52 | SEARLE-XL-OTI (CLIP L/14) | 37.76 | | No | 2023-03-27 | Zero-Shot Composed Image Retrieval with Textual Inversion | Code |
| 53 | MagicLens (CLIP B) | 36.85 | | No | 2024-03-28 | MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | Code |
| 54 | LinCIR (CLIP L/14) | 36.39 | | No | 2023-12-04 | Language-only Efficient Training of Zero-shot Composed Image Retrieval | Code |
| 55 | SEARLE-XL (CLIP L/14) | 35.9 | | No | 2023-03-27 | Zero-Shot Composed Image Retrieval with Textual Inversion | Code |
| 56 | VAL w/ GloVe | 35.38 | | No | - | - | Code |
| 57 | iSEARLE-OTI (CLIP B/32) | 34.93 | | No | 2024-05-05 | iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | Code |
| 58 | iSEARLE (CLIP B/32) | 34.6 | | No | 2024-05-05 | iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | Code |
| 59 | Pic2Word | 34.2 | | No | 2023-02-06 | Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval | Code |
| 60 | SEARLE (CLIP B/32) | 32.71 | | No | 2023-03-27 | Zero-Shot Composed Image Retrieval with Textual Inversion | Code |
| 61 | SEARLE-OTI (CLIP B/32) | 32.39 | | No | 2023-03-27 | Zero-Shot Composed Image Retrieval with Textual Inversion | Code |
| 62 | PALAVRA | 28.51 | | No | 2022-04-04 | "This is my unicorn, Fluffy": Personalizing frozen vision-language representations | Code |
| 63 | ComposeAE | 20.6 | | No | 2020-06-19 | Compositional Learning of Image-Text Query for Image Retrieval | Code |