Image Retrieval on CIRCO
Metric: mAP@10 (higher is better)
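As a rough illustration of the metric, the sketch below computes average precision at K for one query and averages it over queries. It assumes each query can have several ground-truth images and normalizes by min(K, number of ground truths), which is one common convention; the exact definition used for this leaderboard is not stated on this page. The function names and the example image IDs are hypothetical. The leaderboard values look like percentages, so multiplying the result by 100 is also an assumption.

```python
from typing import Sequence, Set


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """AP@K for a single query.

    ranked:   retrieved item IDs, best match first (assumed to contain no duplicates).
    relevant: ground-truth item IDs for the query.
    """
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank, counted only at hits
    # Assumed normalizer: the best achievable number of hits within the top K.
    return score / min(k, len(relevant))


def mean_average_precision_at_k(
    rankings: Sequence[Sequence[str]],
    ground_truths: Sequence[Set[str]],
    k: int = 10,
) -> float:
    """Mean of AP@K over all queries (0.0 if there are no queries)."""
    aps = [average_precision_at_k(r, g, k) for r, g in zip(rankings, ground_truths)]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Hypothetical query: two ground truths, found at ranks 1 and 3.
    ap = average_precision_at_k(["img_a", "img_x", "img_b"], {"img_a", "img_b"}, k=10)
    print(round(100 * ap, 2))
```

Under this convention a ranking that places every ground-truth image at the top scores 1.0, and a query with no relevant image in the top K scores 0.0.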
Results
| # | Model | mAP@10 | Extra Data | Paper | Date | Code | SOTA badge |
|---|---|---|---|---|---|---|---|
| 1 | MMRet-MLLM | 43.4 | No | MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval | 2024-12-19 | Code | SOTA |
| 2 | MMRet-Large (CLIP L/14) | 40.2 | No | MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval | 2024-12-19 | Code | |
| 3 | SCOT (WACV 2025) | 37.88 | No | SCOT: Self-Supervised Contrastive Pretraining For Zero-Shot Compositional Retrieval | 2025-01-12 | - | |
| 4 | SEIZE (CLIP G/14 & GPT-4o) | 37.23 | No | - | - | Code | |
| 5 | MagicLens (CoCa L) | 35.4 | No | MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | 2024-03-28 | Code | SOTA |
| 6 | MMRet-Base (CLIP B/16) | 35 | No | MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval | 2024-12-19 | Code | |
| 7 | IP-CIR + LDRE (CLIP G/14) | 34.26 | No | Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy | 2024-11-24 | - | |
| 8 | SEIZE (CLIP G/14) | 33.77 | No | - | - | Code | |
| 9 | LDRE (CLIP G/14) | 32.24 | No | - | - | Code | |
| 10 | MagicLens (CoCa B) | 32 | No | MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | 2024-03-28 | Code | |
| 11 | OSrCIR (CLIP G/14) | 31.14 | No | Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval | 2024-12-15 | Code | |
| 12 | MagicLens (CLIP L) | 30.8 | No | MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | 2024-03-28 | Code | |
| 13 | CoVR-BLIP-2 | 29.55 | No | CoVR-2: Automatic Data Construction for Composed Video Retrieval | 2023-08-28 | Code | SOTA |
| 14 | ImageScope (CLIP-ViT-L/14) | 29.23 | No | ImageScope: Unifying Language-Guided Image Retrieval via Large Multimodal Model Collective Reasoning | 2025-03-13 | Code | |
| 15 | CIReVL (CLIP G/14) | 27.59 | No | Vision-by-Language for Training-Free Compositional Image Retrieval | 2023-10-13 | Code | |
| 16 | IP-CIR + LDRE (CLIP L/14) | 27.41 | No | Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy | 2024-11-24 | - | |
| 17 | SEIZE (CLIP L/14) | 25.82 | No | - | - | Code | |
| 18 | OSrCIR (CLIP L/14) | 25.33 | No | Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval | 2024-12-15 | Code | |
| 19 | LDRE (CLIP L/14) | 24.03 | No | - | - | Code | |
| 20 | MagicLens (CLIP B) | 23.8 | No | MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | 2024-03-28 | Code | |
| 21 | RTD + LinCIR (CLIP G/14) | 22.29 | No | An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval | 2024-06-13 | Code | |
| 22 | LinCIR (CLIP G/14) | 21.01 | No | Language-only Efficient Training of Zero-shot Composed Image Retrieval | 2023-12-04 | Code | |
| 23 | CoLLM (Pretrained - CLIP-L/14) | 20.8 | No | CoLLM: A Large Language Model for Composed Image Retrieval | 2025-03-25 | Code | |
| 24 | CoLLM (Pretrained - BLIP-L/16) | 20.4 | No | CoLLM: A Large Language Model for Composed Image Retrieval | 2025-03-25 | Code | |
| 25 | OSrCIR (CLIP B/32) | 19.17 | No | Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval | 2024-12-15 | Code | |
| 26 | CIReVL (CLIP L/14) | 19.01 | No | Vision-by-Language for Training-Free Compositional Image Retrieval | 2023-10-13 | Code | |
| 27 | LDRE (CLIP B/32) | 18.32 | No | - | - | Code | |
| 28 | RTD + LinCIR (CLIP L/14) | 18.11 | No | An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval | 2024-06-13 | Code | |
| 29 | CompoDiff (CLIP G/14) | 17.71 | No | CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion | 2023-03-21 | Code | SOTA |
| 30 | CIReVL (CLIP B/32) | 15.42 | No | Vision-by-Language for Training-Free Compositional Image Retrieval | 2023-10-13 | Code | |
| 31 | Context-I2W | 14.62 | No | Context-I2W: Mapping Images to Context-dependent Words for Accurate Zero-Shot Composed Image Retrieval | 2023-09-28 | Code | |
| 32 | iSEARLE-XL (CLIP L/14) | 13.61 | No | iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | 2024-05-05 | Code | |
| 33 | LinCIR (CLIP L/14) | 13.58 | No | Language-only Efficient Training of Zero-shot Composed Image Retrieval | 2023-12-04 | Code | |
| 34 | CompoDiff (CLIP L/14) | 13.51 | No | CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion | 2023-03-21 | Code | |
| 35 | SEARLE-XL (CLIP L/14) | 12.73 | No | Zero-Shot Composed Image Retrieval with Textual Inversion | 2023-03-27 | Code | |
| 36 | iSEARLE-XL-OTI (CLIP L/14) | 12.67 | No | iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | 2024-05-05 | Code | |
| 37 | MTCIR (CLIP L/14) | 11.63 | No | Pretrain like Your Inference: Masked Tuning Improves Zero-Shot Composed Image Retrieval | 2023-11-13 | Code | |
| 38 | iSEARLE (CLIP B/32) | 11.24 | No | iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | 2024-05-05 | Code | |
| 39 | iSEARLE-OTI (CLIP B/32) | 10.94 | No | iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval | 2024-05-05 | Code | |
| 40 | SEARLE (CLIP B/32) | 9.94 | No | Zero-Shot Composed Image Retrieval with Textual Inversion | 2023-03-27 | Code | |
| 41 | Pic2Word | 9.51 | No | Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval | 2023-02-06 | Code | SOTA |
| 42 | MTCIR (BLIP B/16) | 8.03 | No | Pretrain like Your Inference: Masked Tuning Improves Zero-Shot Composed Image Retrieval | 2023-11-13 | Code | |
| 43 | PALAVRA | 5.32 | No | "This is my unicorn, Fluffy": Personalizing frozen vision-language representations | 2022-04-04 | Code | SOTA |