Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

2D Object Detection on COCO minival

Metric: AP75 (higher is better)
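
AP75 is the COCO average precision computed at an intersection-over-union (IoU) threshold of 0.75: a predicted box counts as a true positive only if its IoU with a ground-truth box reaches 0.75. The sketch below illustrates just that matching criterion. The function names and the (x1, y1, x2, y2) corner box format are assumptions made for illustration, not part of this page or of any particular library; the full metric also ranks detections by confidence and averages precision over recall levels and categories, which is not shown here.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2) corners.

    The corner format is an assumption for this sketch; COCO annotation
    files store boxes as (x, y, width, height).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.75):
    """Hypothetical helper: does this prediction match at the AP75 threshold?"""
    return iou(pred_box, gt_box) >= threshold
```

For example, a 10x8 prediction lying inside a 10x10 ground-truth box has IoU 0.8 and would count as a match at this threshold, while a 10x7 prediction (IoU 0.7) would not, even though it would still count under the looser AP50 criterion.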

Results

| # | Model | AP75 | Augmentations | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Focal-Stable-DINO (Focal-Huge, no TTA) | 71.4 | Yes | A Strong and Reproducible Object Detector with O... | 2023-04-25 | Code |
| 2 | EVA | 70.8 | Yes | EVA: Exploring the Limits of Masked Visual Repre... | 2022-11-14 | Code |
| 3 | UNINEXT-H | 66.7 | Yes | Universal Instance Perception as Object Discover... | 2023-03-12 | Code |
| 4 | QueryInst (single scale) | 61.7 | No | Instances as Queries | 2021-05-05 | Code |
| 5 | SOLQ (Swin-L, single scale) | 61.3 | No | SOLQ: Segmenting Objects by Learning Queries | 2021-06-04 | Code |
| 6 | YOLOv4-P7 CSP-P7 (single-scale, 16 fps) | 60.7 | No | Scaled-YOLOv4: Scaling Cross Stage Partial Network | 2020-11-16 | Code |
| 7 | YOLOR-D6 (1280, single-scale, 31 fps) | 60.6 | No | You Only Learn One Representation: Unified Netwo... | 2021-05-10 | Code |
| 8 | EfficientDet-D7x (single-scale) | 59 | No | EfficientDet: Scalable and Efficient Object Dete... | 2019-11-20 | Code |
| 9 | UniverseNet-20.08d (Res2Net-101, DCN, multi-scale) | 58.9 | No | USB: Universal-Scale Object Detection Benchmark | 2021-03-25 | Code |
| 10 | YOLOR-P6 (1280, single-scale, 72 fps) | 57.4 | No | You Only Learn One Representation: Unified Netwo... | 2021-05-10 | Code |
| 11 | ResNeSt-200 (multi-scale) | 57.07 | No | ResNeSt: Split-Attention Networks | 2020-04-19 | Code |
| 12 | GCNet (ResNeXt-101 + DCN + cascade + GC r4) | 56.1 | No | Global Context Networks | 2020-12-24 | Code |
| 13 | ELSA-S (Cascade Mask RCNN) | 56 | No | ELSA: Enhanced Local Self-Attention for Vision T... | 2021-12-23 | Code |
| 14 | FocalNet-T (LRF, Cascade Mask R-CNN) | 56 | No | Focal Modulation Networks | 2022-03-22 | Code |
| 15 | DINO-5scale (24 epoch) | 56 | No | DINO: DETR with Improved DeNoising Anchor Boxes ... | 2022-03-07 | Code |
| 16 | DINO-5scale (36 epoch) | 55.8 | No | DINO: DETR with Improved DeNoising Anchor Boxes ... | 2022-03-07 | Code |
| 17 | FocalNet-T (SRF, Cascade Mask R-CNN) | 55.8 | No | Focal Modulation Networks | 2022-03-22 | Code |
| 18 | ResNeSt-200-DCN (single-scale) | 55.4 | No | ResNeSt: Split-Attention Networks | 2020-04-19 | Code |
| 19 | UniverseNet-20.08d (Res2Net-101, DCN, single-scale) | 55.4 | No | USB: Universal-Scale Object Detection Benchmark | 2021-03-25 | Code |
| 20 | ResNeSt-200 (single-scale) | 55.17 | No | ResNeSt: Split-Attention Networks | 2020-04-19 | Code |
| 21 | Sparse R-CNN (PVTv2-B2) | 54.9 | No | PVT v2: Improved Baselines with Pyramid Vision T... | 2021-06-25 | Code |
| 22 | BoTNet 200 (Mask R-CNN, single scale, 72 epochs) | 54.6 | No | Bottleneck Transformers for Visual Recognition | 2021-01-27 | Code |
| 23 | BoTNet 152 (Mask R-CNN, single scale, 72 epochs) | 54.2 | No | Bottleneck Transformers for Visual Recognition | 2021-01-27 | Code |
| 24 | DN-Deformable-DETR-R50++ | 53.8 | No | DN-DETR: Accelerate DETR Training by Introducing... | 2022-03-02 | Code |
| 25 | REGO-Deformable DETR-X101 | 53.1 | No | Recurrent Glimpse-based Decoder for Detection wi... | 2021-12-09 | Code |
| 26 | Mask R-CNN (ResNeXt-152-FPN, cascade) | 52.9 | No | Rethinking ImageNet Pre-training | 2018-11-21 | Code |
| 27 | ELSA-S (Mask RCNN) | 52.9 | No | ELSA: Enhanced Local Self-Attention for Vision T... | 2021-12-23 | Code |
| 28 | UniverseNet-20.08 (Res2Net-50, DCN, single-scale) | 52.6 | No | USB: Universal-Scale Object Detection Benchmark | 2021-03-25 | Code |
| 29 | GCNet (ResNeXt-101 + DCN + cascade + GC r16) | 52.2 | No | GCNet: Non-local Networks Meet Squeeze-Excitatio... | 2019-04-25 | Code |
| 30 | MAE-Det(MAE-Det-L+GFLV2) | 52.2 | No | MAE-DET: Revisiting Maximum Entropy Principle in... | 2021-11-26 | Code |
| 31 | Res2Net101+HTC | 51.3 | No | Res2Net: A New Multi-scale Backbone Architecture | 2019-04-02 | Code |
| 32 | Mask R-CNN (ResNeXt-152-FPN) | 51.1 | No | Rethinking ImageNet Pre-training | 2018-11-21 | Code |
| 33 | Cascade Mask R-CNN (ResNet-50) | 50.5 | No | Deep Residual Learning for Image Recognition | 2015-12-10 | Code |
| 34 | HoughNet (HG-104, MS) | 50.3 | No | HoughNet: Integrating near and long-range eviden... | 2020-07-05 | Code |
| 35 | DAB-DETR-DC5-R101 | 50.2 | No | DAB-DETR: Dynamic Anchor Boxes are Better Querie... | 2022-01-28 | Code |
| 36 | Conditional DETR-DC5-R101 | 49.5 | No | Conditional DETR for Fast Training Convergence | 2021-08-13 | Code |
| 37 | Sparse R-CNN (ResNet-101, learnable proposals, random crop aug, FPN) | 49.5 | No | Sparse R-CNN: End-to-End Object Detection with L... | 2020-11-25 | Code |
| 38 | Mask R-CNN-FPN (AOGNet-40M) | 49.1 | No | Attentive Normalization | 2019-08-04 | Code |
| 39 | Mask R-CNN (ResNeXt-152 + 1 NL) | 48.9 | No | Non-local Neural Networks | 2017-11-21 | Code |
| 40 | R3-CNN (ResNet-50-FPN, DCN) | 48.9 | No | Recursively Refined R-CNN: Instance Segmentation... | 2021-04-03 | Code |
| 41 | Anchor DETR-DC5-R101 | 48.8 | No | Anchor DETR: Query Design for Transformer-Based ... | 2021-09-15 | Code |
| 42 | Cascade R-CNN (HRNetV2p-W48) | 48.7 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code |
| 43 | Pix2seq (R101-DC5) | 48.6 | No | Pix2seq: A Language Modeling Framework for Objec... | 2021-09-22 | Code |
| 44 | Conditional DETR-DC5-R50 | 48.5 | No | Conditional DETR for Fast Training Convergence | 2021-08-13 | Code |
| 45 | R3-CNN (ResNet-50-FPN, GC-Net) | 48.4 | No | Recursively Refined R-CNN: Instance Segmentation... | 2021-04-03 | Code |
| 46 | GFL (ResNet-50) | 48.3 | No | Deep Residual Learning for Image Recognition | 2015-12-10 | Code |
| 47 | Sparse R-CNN (ResNet-50, learnable proposals, random crop aug, FPN) | 48.2 | No | Sparse R-CNN: End-to-End Object Detection with L... | 2020-11-25 | Code |
| 48 | Faster RCNN-R101-FPN+ | 47.8 | No | End-to-End Object Detection with Transformers | 2020-05-26 | Code |
| 49 | DETR-DC5 (ResNet-101) | 47.7 | No | End-to-End Object Detection with Transformers | 2020-05-26 | Code |
| 50 | Cascade R-CNN (HRNetV2p-W32) | 47.7 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code |
| 51 | RetinaNet (ViL-Base, multi-scale, 3x) | 47.6 | No | Multi-Scale Vision Longformer: A New Vision Tran... | 2021-03-29 | Code |
| 52 | Conditional DETR-R101 | 47.5 | No | Conditional DETR for Fast Training Convergence | 2021-08-13 | Code |
| 53 | Anchor DETR-DC5-R50 | 47.5 | No | Anchor DETR: Query Design for Transformer-Based ... | 2021-09-15 | Code |
| 54 | DAB-DETR-R101 | 47.2 | No | DAB-DETR: Dynamic Anchor Boxes are Better Querie... | 2022-01-28 | Code |
| 55 | Sparse R-CNN (ResNet-101, FPN) | 47.2 | No | Sparse R-CNN: End-to-End Object Detection with L... | 2020-11-25 | Code |
| 56 | Mask R-CNN-FPN (ResNeXt-101, GN+WS) | 47.11 | No | Micro-Batch Training with Batch-Channel Normaliz... | 2019-03-25 | Code |
| 57 | RetinaNet (ViL-Base) | 47.1 | No | Multi-Scale Vision Longformer: A New Vision Tran... | 2021-03-29 | Code |
| 58 | ATSS (ResNet-50) | 47 | No | Deep Residual Learning for Image Recognition | 2015-12-10 | Code |
| 59 | HoughNet (HG-104) | 46.9 | No | HoughNet: Integrating near and long-range eviden... | 2020-07-05 | Code |
| 60 | ExtremeNet (Hourglass-104, multi-scale) | 46.8 | No | Bottom-up Object Detection by Grouping Extreme a... | 2019-01-23 | Code |
| 61 | Cascade R-CNN (ResNet-101-FPN+, cascade) | 46.6 | No | Cascade R-CNN: Delving into High Quality Object ... | 2017-12-03 | Code |
| 62 | Faster R-CNN (FPN, X-volution) | 46.4 | No | X-volution: On the unification of convolution an... | 2021-06-04 | - |
| 63 | R3-CNN (ResNet-50-FPN) | 46.3 | No | Recursively Refined R-CNN: Instance Segmentation... | 2021-04-03 | Code |
| 64 | Mask R-CNN (ResNet-101-FPN, GroupNorm, long) | 46.2 | No | Group Normalization | 2018-03-22 | Code |
| 65 | PVT-Large (RetinaNet 3x,MS) | 46.1 | No | Pyramid Vision Transformer: A Versatile Backbone... | 2021-02-24 | Code |
| 66 | Pix2seq (R50-DC5) | 46.1 | No | Pix2seq: A Language Modeling Framework for Objec... | 2021-09-22 | Code |
| 67 | Faster R-CNN (HRNetV2p-W48) | 45.9 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code |
| 68 | Conditional DETR-R50 | 45.7 | No | Conditional DETR for Fast Training Convergence | 2021-08-13 | Code |
| 69 | Sparse R-CNN (ResNet-50, FPN) | 45.7 | No | Sparse R-CNN: End-to-End Object Detection with L... | 2020-11-25 | Code |
| 70 | Faster R-CNN (LIP-ResNet-101) | 45.6 | No | LIP: Local Importance-based Pooling | 2019-08-12 | Code |
| 71 | R3-CNN (ResNet-50-FPN, GRoIE) | 45.6 | No | Recursively Refined R-CNN: Instance Segmentation... | 2021-04-03 | Code |
| 72 | TridentNet (ResNet-101) | 45.5 | No | Scale-Aware Trident Networks for Object Detection | 2019-01-07 | Code |
| 73 | PVT-Large (RetinaNet 1x) | 45.4 | No | Pyramid Vision Transformer: A Versatile Backbone... | 2021-02-24 | Code |
| 74 | Cascade R-CNN (HRNetV2p-W18) | 44.9 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code |
| 75 | PoolFormer-S36 (Mask R-CNN) | 44.8 | No | MetaFormer Is Actually What You Need for Vision | 2021-11-22 | Code |
| 76 | Faster R-CNN (HRNetV2p-W32) | 44.8 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code |
| 77 | Mask R-CNN (ResNet-101 + 1 NL) | 44.5 | No | Non-local Neural Networks | 2017-11-21 | Code |
| 78 | Grid R-CNN (ResNet-101-FPN) | 44.4 | No | Grid R-CNN | 2018-11-29 | Code |
| 79 | Mask R-CNN (ResNet-50-FPN, GroupNorm, long) | 44.4 | No | Group Normalization | 2018-03-22 | Code |
| 80 | PPDet (ResNet-101-FPN) | 44.2 | No | Reducing Label Noise in Anchor-Free Object Detec... | 2020-08-03 | Code |
| 81 | RetinaMask (ResNet-101-FPN) | 44.1 | No | RetinaMask: Learning to predict masks improves s... | 2019-01-10 | Code |
| 82 | GCnet (ResNet-50-FPN, GRoIE) | 44 | No | GCNet: Non-local Networks Meet Squeeze-Excitatio... | 2019-04-25 | Code |
| 83 | Mask R-CNN (ResNet-50-FPN, GroupNorm) | 44 | No | Group Normalization | 2018-03-22 | Code |
| 84 | CenterNet511 (Hourglass-52) | 43.9 | No | CenterNet: Keypoint Triplets for Object Detection | 2019-04-17 | Code |
| 85 | Cascade R-CNN (ResNet-50-FPN+) | 43.7 | No | Cascade R-CNN: Delving into High Quality Object ... | 2017-12-03 | Code |
| 86 | ExtremeNet (Hourglass-104, single-scale) | 43.7 | No | Bottom-up Object Detection by Grouping Extreme a... | 2019-01-23 | Code |
| 87 | Faster R-CNN+aLRP Loss (ResNet-50, 500 scale) | 43.3 | No | A Ranking-based, Balanced Loss Function Unifying... | 2020-09-28 | Code |
| 88 | FPN+ | 43.3 | No | Feature Pyramid Networks for Object Detection | 2016-12-09 | Code |
| 89 | Grid R-CNN (ResNet-50-FPN) | 42.4 | No | Grid R-CNN | 2018-11-29 | Code |
| 90 | RetinaNet+aLRP Loss (ResNet-50, 500 scale) | 42.3 | No | A Ranking-based, Balanced Loss Function Unifying... | 2020-09-28 | Code |
| 91 | Libra R-CNN (ResNet-50 FPN) | 42 | No | Libra R-CNN: Towards Balanced Learning for Objec... | 2019-04-04 | Code |
| 92 | Mask R-CNN (ResNet-50 + 1 NL) | 41.9 | No | Non-local Neural Networks | 2017-11-21 | Code |
| 93 | Mask R-CNN (ResNet-50-FPN, GRoIE) | 41.7 | No | A novel Region of Interest Extraction Layer for ... | 2020-04-28 | Code |
| 94 | FoveaBox+aLRP Loss (ResNet-50, 500 scale) | 41.5 | No | A Ranking-based, Balanced Loss Function Unifying... | 2020-09-28 | Code |
| 95 | FoveaBox (ResNet-101-FPN, 800x800) | 41.5 | No | FoveaBox: Beyond Anchor-based Object Detector | 2019-04-08 | Code |
| 96 | Faster R-CNN (HRNetV2p-W18) | 41.5 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code |
| 97 | FCOS (ResNet-50-FPN + improvements) | 41.4 | No | FCOS: Fully Convolutional One-Stage Object Detec... | 2019-04-02 | Code |
| 98 | CornerNet511 (Hourglass-104) | 40.9 | No | CornerNet: Detecting Objects as Paired Keypoints | 2018-08-03 | Code |
| 99 | HTC (cascade) | 40.7 | No | Hybrid Task Cascade for Instance Segmentation | 2019-01-22 | Code |
| 100 | Faster R-CNN (ResNet-50-FPN, GRoIE) | 40.6 | No | A novel Region of Interest Extraction Layer for ... | 2020-04-28 | Code |
| 101 | FoveaBox+Retina (ResNet-50) | 40.5 | No | FoveaBox: Beyond Anchor-based Object Detector | 2019-04-08 | Code |
| 102 | FoveaBox (ResNet-101-FPN, 600x600) | 40.2 | No | FoveaBox: Beyond Anchor-based Object Detector | 2019-04-08 | Code |
| 103 | Mask R-CNN (ResNeXt-101-FPN) | 38.9 | No | Mask R-CNN | 2017-03-20 | Code |
| 104 | GHM-C + GHM-R (RetinaNet-FPN-ResNet-50, M=30) | 38.1 | No | Gradient Harmonized Single-stage Detector | 2018-11-13 | Code |
| 105 | FoveaBox (ResNet-50-FPN, 600x600) | 37.9 | No | FoveaBox: Beyond Anchor-based Object Detector | 2019-04-08 | Code |
| 106 | FSAF (ResNet-50) | 37.9 | No | Feature Selective Anchor-Free Module for Single-... | 2019-03-02 | Code |