Object Detection on COCO-O
Metric: Average mAP (higher is better)
Results
| # | Model | Average mAP | SOTA | Extra Data | Paper | Date | Code |
|---|-------|-------------|------|------------|-------|------|------|
| 1 | EVA | 57.8 | SOTA | No | EVA: Exploring the Limits of Masked Visual Representation Learning at Scale | 2022-11-14 | Code |
| 2 | DETA (Swin-L) | 48.5 | | No | NMS Strikes Back | 2022-12-12 | Code |
| 3 | GLIP-L (Swin-L) | 48 | SOTA | No | Grounded Language-Image Pre-training | 2021-12-07 | Code |
| 4 | GRiT (ViT-H) | 42.9 | | No | GRiT: A Generative Region-to-text Transformer for Object Understanding | 2022-12-01 | Code |
| 5 | DINO (Swin-L) | 42.1 | | No | DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection | 2022-03-07 | Code |
| 6 | CBNetV2 (Swin-L) | 39 | SOTA | No | CBNet: A Composite Backbone Network Architecture for Object Detection | 2021-07-01 | Code |
| 7 | ConvNeXt-XL (Cascade Mask R-CNN) | 37.5 | | No | A ConvNet for the 2020s | 2022-01-10 | Code |
| 8 | InternImage-L (Cascade Mask R-CNN) | 37 | | No | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions | 2022-11-10 | Code |
| 9 | DyHead (Swin-L) | 35.3 | SOTA | No | Dynamic Head: Unifying Object Detection Heads with Attentions | 2021-06-15 | Code |
| 10 | ViTDet (ViT-H) | 34.3 | | No | Exploring Plain Vision Transformer Backbones for Object Detection | 2022-03-30 | Code |
| 11 | ViT-Adapter (BEiTv2-L) | 34.25 | | No | Vision Transformer Adapter for Dense Predictions | 2022-05-17 | Code |
| 12 | FIBER-B (Swin-B) | 33.7 | | No | Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone | 2022-06-15 | Code |
| 13 | QueryInst (Swin-L) | 33.2 | SOTA | No | Instances as Queries | 2021-05-05 | Code |
| 14 | YOLOv6-L6 | 32.5 | | No | YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications | 2022-09-07 | Code |
| 15 | YOLOv7-E6E | 32 | | No | YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors | 2022-07-06 | Code |
| 16 | MViTV2-H (Cascade Mask R-CNN) | 30.9 | | No | MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | 2021-12-02 | Code |
| 17 | Det-AdvProp (EfficientNet-B5) | 30.8 | SOTA | No | Robust and Accurate Object Detection via Adversarial Learning | 2021-03-23 | Code |
| 18 | YOLOv4-P6 | 30.4 | SOTA | No | YOLOv4: Optimal Speed and Accuracy of Object Detection | 2020-04-23 | Code |
| 19 | YOLOX-X | 30.3 | | No | YOLOX: Exceeding YOLO Series in 2021 | 2021-07-18 | Code |
| 20 | CenterNet2 (R2-101-DCN) | 29.5 | | No | Probabilistic two-stage detection | 2021-03-12 | Code |
| 21 | GLIP-T (Swin-T) | 29.1 | | No | Grounded Language-Image Pre-training | 2021-12-07 | Code |
| 22 | EfficientDet-D5 (EfficientNet-B5) | 28.5 | SOTA | No | EfficientDet: Scalable and Efficient Object Detection | 2019-11-20 | Code |
| 23 | PVTv2-B5 (Mask R-CNN) | 28.2 | | No | PVT v2: Improved Baselines with Pyramid Vision Transformer | 2021-06-25 | Code |
| 24 | VFNet (RX-101-64x4d) | 28 | | No | VarifocalNet: An IoU-aware Dense Object Detector | 2020-08-31 | Code |
| 25 | GCNet (RX-101-32x4d-DCN) | 26 | SOTA | No | GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond | 2019-04-25 | Code |
| 26 | GFLv2 (R2-101-DCN) | 25.1 | | No | Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection | 2020-11-25 | Code |
| 27 | RepPointsV2 (RX-101-64x4d-DCN) | 24.9 | | No | RepPoints V2: Verification Meets Regression for Object Detection | 2020-07-16 | Code |
| 28 | UniverseNet (R2-101-DCN) | 24.8 | | No | USB: Universal-Scale Object Detection Benchmark | 2021-03-25 | Code |
| 29 | YOLOX-S | 20.6 | | No | YOLOX: Exceeding YOLO Series in 2021 | 2021-07-18 | Code |
| 30 | YOLOS-B (ViT-B) | 20 | | No | You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection | 2021-06-01 | Code |
| 31 | DyHead (ResNet-50) | 19.3 | | No | Dynamic Head: Unifying Object Detection Heads with Attentions | 2021-06-15 | Code |
| 32 | HTC (ResNet-50) | 19.1 | SOTA | No | Hybrid Task Cascade for Instance Segmentation | 2019-01-22 | Code |
| 33 | Deformable-DETR (ResNet-50) | 18.5 | | No | Deformable DETR: Deformable Transformers for End-to-End Object Detection | 2020-10-08 | Code |
| 34 | Cascade R-CNN (ResNet-50) | 18.2 | | No | Cascade R-CNN: High Quality Object Detection and Instance Segmentation | 2019-06-24 | Code |
| 35 | Mask R-CNN (ResNet-50) | 17.1 | SOTA | No | Mask R-CNN | 2017-03-20 | Code |
| 36 | DETR (ResNet-50) | 17.1 | | No | End-to-End Object Detection with Transformers | 2020-05-26 | Code |
| 37 | ATSS (ResNet-50) | 16.8 | | No | Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection | 2019-12-05 | Code |
| 38 | FCOS (ResNet-50) | 16.7 | | No | FCOS: Fully Convolutional One-Stage Object Detection | 2019-04-02 | Code |
| 39 | RetinaNet (ResNet-50) | 16.6 | | No | Focal Loss for Dense Object Detection | 2017-08-07 | Code |
| 40 | Faster R-CNN (ResNet-50-FPN) | 16.4 | SOTA | Yes | Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | 2015-06-04 | Code |
| 41 | YOLOv3 (DarkNet-53) | 14.8 | | No | YOLOv3: An Incremental Improvement | 2018-04-08 | Code |
| 42 | SSD (VGG-16) | 13.6 | | No | SSD: Single Shot MultiBox Detector | 2015-12-08 | Code |