Open Vocabulary Object Detection on MSCOCO
Metric: AP 0.5 (higher is better)
Results
| # | Model | AP 0.5 | SOTA | Extra Data | Paper | Date | Code |
|---|-------|--------|------|------------|-------|------|------|
| 1 | Cooperative Foundational Models | 50.3 | Yes | No | Enhancing Novel Object Detection via Cooperative Foundational Models | 2023-11-19 | Code |
| 2 | DE-ViT | 50 | Yes | No | Detect Everything with Few Examples | 2023-09-22 | Code |
| 3 | Yolov8-nano | 47.2 | No | Yes | YOLOv8-Based Visual Detection of Road Hazards: Potholes, Sewer Covers, and Manholes | 2023-10-31 | - |
| 4 | DITO | 46.1 | No | No | Region-centric Image-Language Pretraining for Open-Vocabulary Detection | 2023-09-29 | Code |
| 5 | OV-DQUO(RN50x4) | 45.6 | No | No | OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision | 2024-05-28 | Code |
| 6 | LP-OVOD (OWL-ViT Proposals) | 44.9 | No | No | LP-OVOD: Open-Vocabulary Object Detection by Linear Probing | 2023-10-26 | Code |
| 7 | CLIPSelf | 44.3 | No | No | CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction | 2023-10-02 | Code |
| 8 | CORA+ | 43.1 | Yes | No | CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching | 2023-03-23 | Code |
| 9 | BARON | 42.7 | Yes | No | Aligning Bag of Regions for Open-Vocabulary Object Detection | 2023-02-27 | Code |
| 10 | SIA-OVD (RN50x4) | 41.9 | No | No | SIA-OVD: Shape-Invariant Adapter for Bridging the Image-Region Gap in Open-Vocabulary Detection | 2024-10-08 | Code |
| 11 | CORA | 41.7 | No | No | CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching | 2023-03-23 | Code |
| 12 | RALF | 41.3 | No | No | Retrieval-Augmented Open-Vocabulary Object Detection | 2024-04-08 | Code |
| 13 | LP-OVOD | 40.5 | No | No | LP-OVOD: Open-Vocabulary Object Detection by Linear Probing | 2023-10-26 | Code |
| 14 | Region-CLIP (RN50x4-C4) | 39.3 | Yes | Yes | RegionCLIP: Region-based Language-Image Pretraining | 2021-12-16 | Code |
| 15 | OV-DQUO(R50) | 39.2 | No | No | OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision | 2024-05-28 | Code |
| 16 | Object-Centric-OVD | 36.9 | No | No | Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection | 2022-07-07 | Code |
| 17 | CLIM (RN50) | 36.9 | No | No | CLIM: Contrastive Language-Image Mosaic for Region Representation | 2023-12-18 | Code |
| 18 | OADP (G-OVD) | 35.6 | No | No | Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection | 2023-03-10 | Code |
| 19 | SIA-OVD (RN50) | 35.5 | No | No | SIA-OVD: Shape-Invariant Adapter for Bridging the Image-Region Gap in Open-Vocabulary Detection | 2024-10-08 | Code |
| 20 | VL-PLM (RN50) | 34.4 | No | No | Exploiting Unlabeled Data with Vision and Language Models for Object Detection | 2022-07-18 | Code |
| 21 | CFM-ViT | 34.1 | No | No | Contrastive Feature Masking Open-Vocabulary Vision Transformer | 2023-09-02 | - |
| 22 | MEDet (RN50) | 32.6 | No | No | Open Vocabulary Object Detection with Proposal Mining and Prediction Equalization | 2022-06-22 | Code |
| 23 | Region-CLIP (RN50-C4) | 31.4 | No | Yes | RegionCLIP: Region-based Language-Image Pretraining | 2021-12-16 | Code |
| 24 | OVAD-Baseline | 30 | No | No | Open-vocabulary Attribute Detection | 2022-11-23 | Code |
| 25 | OADP | 30 | No | No | Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection | 2023-03-10 | Code |
| 26 | OV-DETR | 29.4 | No | No | Open-Vocabulary DETR with Conditional Matching | 2022-03-22 | Code |
| 27 | LocOv (RN50-C4) | 28.6 | No | No | Localized Vision-Language Matching for Open-vocabulary Object Detection | 2022-05-12 | Code |
| 28 | Detic | 27.8 | No | No | Detecting Twenty-thousand Classes using Image-level Supervision | 2022-01-07 | Code |
| 29 | ViLD | 27.6 | Yes | Yes | Open-vocabulary Object Detection via Vision and Language Knowledge Distillation | 2021-04-28 | Code |
| 30 | OVR-CNN | 22.8 | Yes | No | Open-Vocabulary Object Detection Using Captions | 2020-11-20 | Code |
| 31 | HierKD | 20.3 | No | No | Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation | 2022-03-20 | Code |
| 32 | Yolov8 | 0.5 | No | No | YOLOv8-AM: YOLOv8 Based on Effective Attention Mechanisms for Pediatric Wrist Fracture Detection | 2024-02-14 | Code |
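The SOTA markers in the listing above appear consistent with a running best over time: an entry is marked if its AP 0.5 exceeded every earlier-dated entry. The page does not state this rule, so treat it as an assumption. A minimal Python sketch of that interpretation follows; `ROWS` is a hand-copied subset of the leaderboard, and `running_best` is a hypothetical helper name, not part of any site API.

```python
from datetime import date

# Subset of (model, AP 0.5, date) rows copied by hand from the leaderboard.
ROWS = [
    ("OVR-CNN", 22.8, date(2020, 11, 20)),
    ("ViLD", 27.6, date(2021, 4, 28)),
    ("Region-CLIP (RN50x4-C4)", 39.3, date(2021, 12, 16)),
    ("Detic", 27.8, date(2022, 1, 7)),
    ("BARON", 42.7, date(2023, 2, 27)),
    ("CORA+", 43.1, date(2023, 3, 23)),
    ("CORA", 41.7, date(2023, 3, 23)),
    ("DE-ViT", 50.0, date(2023, 9, 22)),
    ("DITO", 46.1, date(2023, 9, 29)),
    ("Cooperative Foundational Models", 50.3, date(2023, 11, 19)),
    ("OV-DQUO(RN50x4)", 45.6, date(2024, 5, 28)),
]


def running_best(rows):
    """Return models that beat every earlier-dated score (assumed SOTA rule)."""
    best = float("-inf")
    leaders = []
    # Walk entries oldest first; on equal dates, consider the higher score first.
    for model, score, _day in sorted(rows, key=lambda r: (r[2], -r[1])):
        if score > best:
            best = score
            leaders.append(model)
    return leaders


if __name__ == "__main__":
    for name in running_best(ROWS):
        print(name)
```

On this subset the helper selects the same seven models that carry a SOTA marker in the listing, which supports the running-best reading without confirming it.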