Object Detection on COCO minival

Metric: AP50 (higher is better)
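
AP50 is average precision where a detection counts as correct if its intersection-over-union (IoU) with a ground-truth box is at least 0.50. The snippet below is a minimal sketch of that matching criterion only, not the full COCO evaluation. It assumes boxes in (x1, y1, x2, y2) corner format, and the names `Box`, `iou`, and `is_match_at_50` are illustrative rather than taken from any COCO tooling.

```python
from typing import Tuple

# Assumed corner format: (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Degenerate boxes with zero total area are treated as non-overlapping.
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred: Box, truth: Box) -> bool:
    """True if the predicted box meets the IoU >= 0.50 criterion used by AP50."""
    return iou(pred, truth) >= 0.5
```

A full AP50 score additionally ranks detections by confidence, matches each ground-truth box at most once, and averages the area under the precision-recall curve over categories; those steps are omitted here.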

Results

#	Model	AP50	Extra Data	Paper	Date	Code
1	EVA	82.1	Yes	EVA: Exploring the Limits of Masked Visual Representation Learning at Scale	2022-11-14	Code
2	Focal-Stable-DINO (Focal-Huge, no TTA)	81.5	Yes	A Strong and Reproducible Object Detector with Only Public Datasets	2023-04-25	Code
3	DyHead (Swin-L, multi scale, self-training)	78.2	Yes	Dynamic Head: Unifying Object Detection Heads with Attentions	2021-06-15	Code
4	UNINEXT-H	77.5	Yes	Universal Instance Perception as Object Discovery and Retrieval	2023-03-12	Code
5	Focal-L (DyHead, multi-scale)	77.2	No	Focal Self-attention for Local-Global Interactions in Vision Transformers	2021-07-01	Code
6	DyHead (Swin-L, multi scale)	76.8	No	Dynamic Head: Unifying Object Detection Heads with Attentions	2021-06-15	Code
7	QueryInst (single scale)	75.8	No	Instances as Queries	2021-05-05	Code
8	SOLQ (Swin-L, single scale)	74.9	No	SOLQ: Segmenting Objects by Learning Queries	2021-06-04	Code
9	YOLOv6-L6 (46 fps, 1280, V100)	74.5	No	YOLOv6 v3.0: A Full-Scale Reloading	2023-01-13	Code
10	YOLOR-D6 (1280, single-scale, 31 fps)	73.5	No	You Only Learn One Representation: Unified Network for Multiple Tasks	2021-05-10	Code
11	EfficientDet-D7x (single-scale)	73.4	No	EfficientDet: Scalable and Efficient Object Detection	2019-11-20	Code
12	YOLOv4-P7 CSP-P7 (single-scale, 16 fps)	73.3	No	Scaled-YOLOv4: Scaling Cross Stage Partial Network	2020-11-16	Code
13	BoTNet 200 (Mask R-CNN, single scale, 72 epochs)	71.3	No	Bottleneck Transformers for Visual Recognition	2021-01-27	Code
14	ResNeSt-200 (multi-scale)	71	No	ResNeSt: Split-Attention Networks	2020-04-19	Code
15	BoTNet 152 (Mask R-CNN, single scale, 72 epochs)	71	No	Bottleneck Transformers for Visual Recognition	2021-01-27	Code
16	UniverseNet-20.08d (Res2Net-101, DCN, multi-scale)	70.8	No	USB: Universal-Scale Object Detection Benchmark	2021-03-25	Code
17	YOLOR-P6 (1280, single-scale, 72 fps)	70.6	No	You Only Learn One Representation: Unified Network for Multiple Tasks	2021-05-10	Code
18	ELSA-S (Cascade Mask RCNN)	70.5	No	ELSA: Enhanced Local Self-Attention for Vision Transformer	2021-12-23	Code
19	GCNet (ResNeXt-101 + DCN + cascade + GC r4)	70.4	No	Global Context Networks	2020-12-24	Code
20	ELSA-S (Mask RCNN)	70.4	No	ELSA: Enhanced Local Self-Attention for Vision Transformer	2021-12-23	Code
21	FocalNet-T (LRF, Cascade Mask R-CNN)	70.3	No	Focal Modulation Networks	2022-03-22	Code
22	FocalNet-T (SRF, Cascade Mask R-CNN)	70.1	No	Focal Modulation Networks	2022-03-22	Code
23	ResNeSt-200-DCN (single-scale)	69.53	No	ResNeSt: Split-Attention Networks	2020-04-19	Code
24	UniverseNet-20.08d (Res2Net-101, DCN, single-scale)	69.5	No	USB: Universal-Scale Object Detection Benchmark	2021-03-25	Code
25	Sparse R-CNN (PVTv2-B2)	69.5	No	PVT v2: Improved Baselines with Pyramid Vision Transformer	2021-06-25	Code
26	DINO-5scale (24 epoch)	69.1	No	DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection	2022-03-07	Code
27	DINO-5scale (36 epoch)	69	No	DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection	2022-03-07	Code
28	ResNeSt-200 (single-scale)	68.78	No	ResNeSt: Split-Attention Networks	2020-04-19	Code
29	CenterMask+VoVNet99 (multi-scale)	67.8	No	CenterMask : Real-Time Anchor-Free Instance Segmentation	2019-11-15	Code
30	Mask R-CNN (ResNeXt-152 + 1 NL)	67.8	No	Non-local Neural Networks	2017-11-21	Code
31	DN-Deformable-DETR-R50++	67.6	No	DN-DETR: Accelerate DETR Training by Introducing Query DeNoising	2022-03-02	Code
32	REGO-Deformable DETR-X101	67.5	No	Recurrent Glimpse-based Decoder for Detection with Transformer	2021-12-09	Code
33	Mask R-CNN (ResNeXt-152-FPN)	67.1	No	Rethinking ImageNet Pre-training	2018-11-21	Code
34	UniverseNet-20.08 (Res2Net-50, DCN, single-scale)	67	No	USB: Universal-Scale Object Detection Benchmark	2021-03-25	Code
35	DAB-DETR-DC5-R101	67	No	DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR	2022-01-28	Code
36	GCNet (ResNeXt-101 + DCN + cascade + GC r16)	66.9	No	GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond	2019-04-25	Code
37	Mask R-CNN (ResNeXt-152-FPN, cascade)	66.8	No	Rethinking ImageNet Pre-training	2018-11-21	Code
38	Conditional DETR-DC5-R101	66.8	No	Conditional DETR for Fast Training Convergence	2021-08-13	Code
39	Res2Net101+HTC	66.5	No	Res2Net: A New Multi-scale Backbone Architecture	2019-04-02	Code
40	Mask R-CNN-FPN (AOGNet-40M)	66.2	No	Attentive Normalization	2019-08-04	Code
41	Anchor DETR-DC5-R101	65.7	No	Anchor DETR: Query Design for Transformer-Based Object Detection	2021-09-15	Code
42	Conditional DETR-R101	65.6	No	Conditional DETR for Fast Training Convergence	2021-08-13	Code
43	MAE-Det (MAE-Det-L+GFLV2)	65.5	No	MAE-DET: Revisiting Maximum Entropy Principle in Zero-Shot NAS for Efficient Object Detection	2021-11-26	Code
44	RetinaNet (ViL-Base)	65.5	No	Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding	2021-03-29	Code
45	Conditional DETR-DC5-R50	65.4	No	Conditional DETR for Fast Training Convergence	2021-08-13	Code
46	DETR-DC5 (ResNet-101)	64.7	No	End-to-End Object Detection with Transformers	2020-05-26	Code
47	Anchor DETR-DC5-R50	64.7	No	Anchor DETR: Query Design for Transformer-Based Object Detection	2021-09-15	Code
48	DAB-DETR-R101	64.7	No	DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR	2022-01-28	Code
49	HoughNet (HG-104, MS)	64.6	No	HoughNet: Integrating near and long-range evidence for bottom-up object detection	2020-07-05	Code
50	Sparse R-CNN (ResNet-101, learnable proposals, random crop aug, FPN)	64.6	No	Sparse R-CNN: End-to-End Object Detection with Learnable Proposals	2020-11-25	Code
51	Cascade Mask R-CNN (ResNet-50)	64.3	No	Deep Residual Learning for Image Recognition	2015-12-10	Code
52	R3-CNN (ResNet-50-FPN, DCN)	64.3	No	Recursively Refined R-CNN: Instance Segmentation with Self-RoI Rebalancing	2021-04-03	Code
53	Mask R-CNN-FPN (ResNeXt-101, GN+WS)	64.15	No	Micro-Batch Training with Batch-Channel Normalization and Weight Standardization	2019-03-25	Code
54	R3-CNN (ResNet-50-FPN, GC-Net)	64.1	No	Recursively Refined R-CNN: Instance Segmentation with Self-RoI Rebalancing	2021-04-03	Code
55	Conditional DETR-R50	64	No	Conditional DETR for Fast Training Convergence	2021-08-13	Code
56	Faster R-CNN (FPN, X-volution)	64	No	X-volution: On the unification of convolution and self-attention	2021-06-04	-
57	Faster RCNN-R101-FPN+	63.9	No	End-to-End Object Detection with Transformers	2020-05-26	Code
58	PVT-Large (RetinaNet 1x)	63.7	No	Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions	2021-02-24	Code
59	PVT-Large (RetinaNet 3x, MS)	63.6	No	Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions	2021-02-24	Code
60	Faster R-CNN (LIP-ResNet-101)	63.6	No	LIP: Local Importance-based Pooling	2019-08-12	Code
61	TridentNet (ResNet-101)	63.5	No	Scale-Aware Trident Networks for Object Detection	2019-01-07	Code
62	Sparse R-CNN (ResNet-50, learnable proposals, random crop aug, FPN)	63.4	No	Sparse R-CNN: End-to-End Object Detection with Learnable Proposals	2020-11-25	Code
63	Pix2seq (R101-DC5)	63.2	No	Pix2seq: A Language Modeling Framework for Object Detection	2021-09-22	Code
64	PoolFormer-S36 (Mask R-CNN)	63.1	No	MetaFormer Is Actually What You Need for Vision	2021-11-22	Code
65	Mask R-CNN (ResNet-101 + 1 NL)	63.1	No	Non-local Neural Networks	2017-11-21	Code
66	GFL (ResNet-50)	63	No	Deep Residual Learning for Image Recognition	2015-12-10	Code
67	Mask R-CNN (ResNet-101-FPN, GroupNorm, long)	62.8	No	Group Normalization	2018-03-22	Code
68	Faster R-CNN (HRNetV2p-W48)	62.8	No	Deep High-Resolution Representation Learning for Visual Recognition	2019-08-20	Code
69	Cascade R-CNN (HRNetV2p-W48)	62.7	No	Deep High-Resolution Representation Learning for Visual Recognition	2019-08-20	Code
70	FSAF (ResNeXt-101, anchor-based branches)	62.4	No	Feature Selective Anchor-Free Module for Single-Shot Object Detection	2019-03-02	Code
71	GCNet (ResNet-50-FPN, GRoIE)	62.4	No	GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond	2019-04-25	Code
72	HoughNet (HG-104)	62.2	No	HoughNet: Integrating near and long-range evidence for bottom-up object detection	2020-07-05	Code
73	Sparse R-CNN (ResNet-101, FPN)	62.1	No	Sparse R-CNN: End-to-End Object Detection with Learnable Proposals	2020-11-25	Code
74	ATSS (ResNet-50)	61.9	No	Deep Residual Learning for Image Recognition	2015-12-10	Code
75	Faster R-CNN (HRNetV2p-W32)	61.8	No	Deep High-Resolution Representation Learning for Visual Recognition	2019-08-20	Code
76	Cascade R-CNN (HRNetV2p-W32)	61.7	No	Deep High-Resolution Representation Learning for Visual Recognition	2019-08-20	Code
77	Cascade R-CNN (ResNet-101-FPN+, cascade)	61.6	No	Cascade R-CNN: Delving into High Quality Object Detection	2017-12-03	Code
78	Mask R-CNN (ResNet-50-FPN, GroupNorm, long)	61.6	No	Group Normalization	2018-03-22	Code
79	FPN+	61.3	No	Feature Pyramid Networks for Object Detection	2016-12-09	Code
80	Sparse R-CNN (ResNet-50, FPN)	61.2	No	Sparse R-CNN: End-to-End Object Detection with Learnable Proposals	2020-11-25	Code
81	R3-CNN (ResNet-50-FPN, GRoIE)	61.2	No	Recursively Refined R-CNN: Instance Segmentation with Self-RoI Rebalancing	2021-04-03	Code
82	Mask R-CNN (ResNet-50 + 1 NL)	61.1	No	Non-local Neural Networks	2017-11-21	Code
83	Pix2seq (R50-DC5)	61	No	Pix2seq: A Language Modeling Framework for Object Detection	2021-09-22	Code
84	R3-CNN (ResNet-50-FPN)	61	No	Recursively Refined R-CNN: Instance Segmentation with Self-RoI Rebalancing	2021-04-03	Code
85	Mask R-CNN (ResNet-50-FPN, GroupNorm)	61	No	Group Normalization	2018-03-22	Code
86	Faster R-CNN+aLRP Loss (ResNet-50, 500 scale)	60.7	No	A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection	2020-09-28	Code
87	Grid R-CNN (ResNet-101-FPN)	60.3	No	Grid R-CNN	2018-11-29	Code
88	RetinaNet+aLRP Loss (ResNet-50, 500 scale)	60.3	No	A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection	2020-09-28	Code
89	RetinaMask (ResNet-101-FPN)	60.2	No	RetinaMask: Learning to predict masks improves state-of-the-art single-shot detection for free	2019-01-10	Code
90	Mask R-CNN (ResNet-50-FPN, GRoIE)	59.9	No	A novel Region of Interest Extraction Layer for Instance Segmentation	2020-04-28	Code
91	ExtremeNet (Hourglass-104, multi-scale)	59.6	No	Bottom-up Object Detection by Grouping Extreme and Center Points	2019-01-23	Code
92	PPDet (ResNet-101-FPN)	59.5	No	Reducing Label Noise in Anchor-Free Object Detection	2020-08-03	Code
93	Mask R-CNN (ResNeXt-101-FPN)	59.5	No	Mask R-CNN	2017-03-20	Code
94	HTC (cascade)	59.4	No	Hybrid Task Cascade for Instance Segmentation	2019-01-22	Code
95	Cascade R-CNN (ResNet-50-FPN+)	59.4	No	Cascade R-CNN: Delving into High Quality Object Detection	2017-12-03	Code
96	Libra R-CNN (ResNet-50 FPN)	59.3	No	Libra R-CNN: Towards Balanced Learning for Object Detection	2019-04-04	Code
97	Cascade R-CNN (HRNetV2p-W18)	59.2	No	Deep High-Resolution Representation Learning for Visual Recognition	2019-08-20	Code
98	CenterNet511 (Hourglass-52)	59.2	No	CenterNet: Keypoint Triplets for Object Detection	2019-04-17	Code
99	FSAF (ResNet-101, anchor-based branches)	59.2	No	Feature Selective Anchor-Free Module for Single-Shot Object Detection	2019-03-02	Code
100	Faster R-CNN (ResNet-50-FPN, GRoIE)	59.2	No	A novel Region of Interest Extraction Layer for Instance Segmentation	2020-04-28	Code
101	Faster R-CNN (HRNetV2p-W18)	58.9	No	Deep High-Resolution Representation Learning for Visual Recognition	2019-08-20	Code
102	FoveaBox+aLRP Loss (ResNet-50, 500 scale)	58.8	No	A Ranking-based, Balanced Loss Function Unifying Classification and Localisation in Object Detection	2020-09-28	Code
103	FoveaBox (ResNet-101-FPN, 800x800)	58.4	No	FoveaBox: Beyond Anchor-based Object Detector	2019-04-08	Code
104	Grid R-CNN (ResNet-50-FPN)	58.3	No	Grid R-CNN	2018-11-29	Code
105	FSAF (ResNet-101)	58	No	Feature Selective Anchor-Free Module for Single-Shot Object Detection	2019-03-02	Code
106	FoveaBox+Retina (ResNet-50)	57.8	No	FoveaBox: Beyond Anchor-based Object Detector	2019-04-08	Code
107	FoveaBox (ResNet-101-FPN, 600x600)	57.8	No	FoveaBox: Beyond Anchor-based Object Detector	2019-04-08	Code
108	FCOS (ResNet-50-FPN + improvements)	57.4	No	FCOS: Fully Convolutional One-Stage Object Detection	2019-04-02	Code
109	GHM-C + GHM-R (RetinaNet-FPN-ResNet-50, M=30)	55.5	No	Gradient Harmonized Single-stage Detector	2018-11-13	Code
110	Online Fg Bal. Sampling+Hard Negative Mining (ResNet-50)	55.3	No	Generating Positive Bounding Boxes for Balanced Training of Object Detectors	2019-09-21	Code
111	FoveaBox (ResNet-50-FPN, 600x600)	55.2	No	FoveaBox: Beyond Anchor-based Object Detector	2019-04-08	Code
112	ExtremeNet (Hourglass-104, single-scale)	55.1	No	Bottom-up Object Detection by Grouping Extreme and Center Points	2019-01-23	Code
113	FSAF (ResNet-50)	55	No	Feature Selective Anchor-Free Module for Single-Shot Object Detection	2019-03-02	Code
114	CornerNet511 (Hourglass-104)	53.8	No	CornerNet: Detecting Objects as Paired Keypoints	2018-08-03	Code
115	M2Det (ResNet-101, 320x320)	53.7	No	M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network	2018-11-12	Code
116	Faster R-CNN (Res2Net-50)	53.6	No	Res2Net: A New Multi-scale Backbone Architecture	2019-04-02	Code
117	M2Det (VGG-16, 320x320)	52.2	No	M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network	2018-11-12	Code

Entries flagged SOTA in the listing: #1 EVA, #3 DyHead (Swin-L, multi scale, self-training), #7 QueryInst (single scale), #11 EfficientDet-D7x (single-scale), #30 Mask R-CNN (ResNeXt-152 + 1 NL), #51 Cascade Mask R-CNN (ResNet-50).