Image Retrieval on CREPE (Compositional REPresentation Evaluation)
Metric: Recall@1 (HN-Atom, UC) (higher is better)
Results
| # | Model | Recall@1 (HN-Atom, UC) | Extra Data | Paper | Date | Code |
|---|-------|------------------------|------------|-------|------|------|
| 1 | ViT-L-14 (LAION400M) | 47.86 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 2 | ViT-B-16+240 (LAION400M) | 46.53 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 3 | ViT-B-16 (LAION400M) | 44.93 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 4 | Swin-T (MosaiCLIP, CC-12M) | 44.5 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 5 | RN-50 (MosaiCLIP, CC-12M) | 44.4 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 6 | ViT-B-32 (LAION400M) | 42.75 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 7 | MosaiCLIP (YFCC-FT) | 41.5 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 8 | RN-50 (NegCLIP, CC-12M) | 41.4 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 9 | MosaiCLIP (CC-FT) | 40.9 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 10 | RN50 (YFCC15M) | 39.85 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 11 | Swin-T (NegCLIP, CC-12M) | 39.6 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 12 | RN101 (YFCC15M) | 39.5 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 13 | CLIP (YFCC-FT) | 39.5 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 14 | NegCLIP (YFCC-FT) | 39 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 15 | CLIP-FT (YFCC-FT) | 38.3 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 16 | NegCLIP (CC-FT) | 37.5 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 17 | Swin-T (CLIP, CC-12M) | 37.3 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 18 | RN-50 (CLIP, CC-12M) | 36.7 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 19 | CLIP-FT (CC-FT) | 35.6 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 20 | CLIP (CC-FT) | 35 | No | Coarse-to-Fine Contrastive Learning in Image-Tex... | 2023-05-23 | - |
| 21 | RN50 (CC12M) | 34.88 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
| 22 | Random | 20 | No | CREPE: Can Vision-Language Foundation Models Rea... | 2022-12-13 | Code |
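
For readers unfamiliar with the metric, the following is a minimal sketch of how a Recall@1 percentage is generally computed: the share of queries for which the single highest-scoring candidate is the correct one. It is not the CREPE evaluation code. The function name `recall_at_1`, the score matrix, and the toy values are all hypothetical, and the sketch does not model how the benchmark builds its hard-negative candidate sets.

```python
from typing import Sequence


def recall_at_1(scores: Sequence[Sequence[float]], correct: Sequence[int]) -> float:
    """Return the percentage of queries whose top-scoring candidate is correct.

    scores[i][j] is a (hypothetical) similarity between query i and candidate j;
    correct[i] is the index of the ground-truth candidate for query i.
    Ties are resolved in favour of the lowest candidate index.
    """
    if len(scores) != len(correct):
        raise ValueError("scores and correct must have the same length")
    if not scores:
        raise ValueError("at least one query is required")

    hits = 0
    for row, target in zip(scores, correct):
        # Index of the highest-scoring candidate for this query.
        best = max(range(len(row)), key=row.__getitem__)
        hits += int(best == target)
    return 100.0 * hits / len(scores)


# Toy example (made-up numbers): 4 queries, 3 candidates each.
scores = [
    [0.9, 0.1, 0.3],  # top candidate 0, ground truth 0: hit
    [0.2, 0.8, 0.5],  # top candidate 1, ground truth 2: miss
    [0.1, 0.4, 0.7],  # top candidate 2, ground truth 2: hit
    [0.6, 0.5, 0.2],  # top candidate 0, ground truth 1: miss
]
correct = [0, 2, 2, 1]
print(recall_at_1(scores, correct))  # 50.0
```

Under this definition, a model that picks uniformly at random among n candidates per query scores about 100/n on average, which is why a random baseline gives a useful floor for reading the table.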