Image Retrieval on CREPE (Compositional REPresentation Evaluation)
Metric: Recall@1 (HN-Atom, UC) (higher is better)
Results
| # | Model | Recall@1 (HN-Atom, UC) | Extra Data | Paper | Date | Code |
|---|-------|------------------------|------------|-------|------|------|
| 1 | ViT-L-14 (LAION400M) | 47.86 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 2 | ViT-B-16+240 (LAION400M) | 46.53 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 3 | ViT-B-16 (LAION400M) | 44.93 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 4 | Swin-T (MosaiCLIP, CC-12M) | 44.5 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 5 | RN-50 (MosaiCLIP, CC-12M) | 44.4 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 6 | ViT-B-32 (LAION400M) | 42.75 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 7 | MosaiCLIP (YFCC-FT) | 41.5 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 8 | RN-50 (NegCLIP, CC-12M) | 41.4 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 9 | MosaiCLIP (CC-FT) | 40.9 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 10 | RN50 (YFCC15M) | 39.85 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 11 | Swin-T (NegCLIP, CC-12M) | 39.6 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 12 | RN101 (YFCC15M) | 39.5 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 13 | CLIP (YFCC-FT) | 39.5 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 14 | NegCLIP (YFCC-FT) | 39 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 15 | CLIP-FT (YFCC-FT) | 38.3 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 16 | NegCLIP (CC-FT) | 37.5 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 17 | Swin-T (CLIP, CC-12M) | 37.3 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 18 | RN-50 (CLIP, CC-12M) | 36.7 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 19 | CLIP-FT (CC-FT) | 35.6 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 20 | CLIP (CC-FT) | 35 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 21 | RN50 (CC12M) | 34.88 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 22 | Random | 20 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
Papers
CREPE: Can Vision-Language Foundation Models Reason Compositionally? (2022-12-13, code available)
Coarse-to-Fine Contrastive Learning in Image-Text-Graph Space for Improved Vision-Language Compositionality (2023-05-23)