Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

3D on COCO minival

Metric: AP50 (higher is better)
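AP50 is average precision computed with an intersection-over-union (IoU) threshold of 0.5: a detection counts as correct when it overlaps an unmatched ground-truth box with IoU of at least 0.5. The sketch below illustrates the idea for a single image and a single class, using 101-point interpolation of the precision-recall curve. It is a simplified illustration, not the official COCO evaluation code: the function names `iou` and `ap50` are hypothetical, and per-category averaging, pooling across images, and crowd-region handling are all omitted.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ap50(detections, ground_truth, iou_thr=0.5):
    """Simplified AP at one IoU threshold (single image, single class).

    detections: list of (score, box); ground_truth: list of boxes.
    Assumption: no crowd regions and no per-category averaging.
    """
    if not ground_truth:
        return 0.0
    matched = [False] * len(ground_truth)
    tp_flags = []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(ground_truth):
            if matched[j]:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j >= 0 and best_iou >= iou_thr:
            matched[best_j] = True  # each ground-truth box is matched at most once
            tp_flags.append(True)
        else:
            tp_flags.append(False)
    # Running precision and recall after each detection.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / k)
        recalls.append(tp / len(ground_truth))
    # Make precision non-increasing in recall (the precision envelope).
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Average the envelope at recall = 0.00, 0.01, ..., 1.00.
    total = 0.0
    for i in range(101):
        r = i / 100
        total += next((p for p, rec in zip(precisions, recalls) if rec >= r), 0.0)
    return total / 101


gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
print(f"AP50 = {100 * ap50(dets, gt_boxes):.1f}")
```

The function returns a value in [0, 1]; the leaderboard below reports the metric scaled to a percentage.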

Results

# | Model | AP50 | Augmentations | Paper | Date | Code
1 | EVA | 82.1 | Yes | EVA: Exploring the Limits of Masked Visual Repre... | 2022-11-14 | Code
2 | Focal-Stable-DINO (Focal-Huge, no TTA) | 81.5 | Yes | A Strong and Reproducible Object Detector with O... | 2023-04-25 | Code
3 | DyHead (Swin-L, multi scale, self-training) | 78.2 | Yes | Dynamic Head: Unifying Object Detection Heads wi... | 2021-06-15 | Code
4 | UNINEXT-H | 77.5 | Yes | Universal Instance Perception as Object Discover... | 2023-03-12 | Code
5 | Focal-L (DyHead, multi-scale) | 77.2 | No | Focal Self-attention for Local-Global Interactio... | 2021-07-01 | Code
6 | DyHead (Swin-L, multi scale) | 76.8 | No | Dynamic Head: Unifying Object Detection Heads wi... | 2021-06-15 | Code
7 | QueryInst (single scale) | 75.8 | No | Instances as Queries | 2021-05-05 | Code
8 | SOLQ (Swin-L, single scale) | 74.9 | No | SOLQ: Segmenting Objects by Learning Queries | 2021-06-04 | Code
9 | YOLOv6-L6 (46 fps, 1280, V100) | 74.5 | No | YOLOv6 v3.0: A Full-Scale Reloading | 2023-01-13 | Code
10 | YOLOR-D6 (1280, single-scale, 31 fps) | 73.5 | No | You Only Learn One Representation: Unified Netwo... | 2021-05-10 | Code
11 | EfficientDet-D7x (single-scale) | 73.4 | No | EfficientDet: Scalable and Efficient Object Dete... | 2019-11-20 | Code
12 | YOLOv4-P7 CSP-P7 (single-scale, 16 fps) | 73.3 | No | Scaled-YOLOv4: Scaling Cross Stage Partial Network | 2020-11-16 | Code
13 | BoTNet 200 (Mask R-CNN, single scale, 72 epochs) | 71.3 | No | Bottleneck Transformers for Visual Recognition | 2021-01-27 | Code
14 | ResNeSt-200 (multi-scale) | 71 | No | ResNeSt: Split-Attention Networks | 2020-04-19 | Code
15 | BoTNet 152 (Mask R-CNN, single scale, 72 epochs) | 71 | No | Bottleneck Transformers for Visual Recognition | 2021-01-27 | Code
16 | UniverseNet-20.08d (Res2Net-101, DCN, multi-scale) | 70.8 | No | USB: Universal-Scale Object Detection Benchmark | 2021-03-25 | Code
17 | YOLOR-P6 (1280, single-scale, 72 fps) | 70.6 | No | You Only Learn One Representation: Unified Netwo... | 2021-05-10 | Code
18 | ELSA-S (Cascade Mask RCNN) | 70.5 | No | ELSA: Enhanced Local Self-Attention for Vision T... | 2021-12-23 | Code
19 | GCNet (ResNeXt-101 + DCN + cascade + GC r4) | 70.4 | No | Global Context Networks | 2020-12-24 | Code
20 | ELSA-S (Mask RCNN) | 70.4 | No | ELSA: Enhanced Local Self-Attention for Vision T... | 2021-12-23 | Code
21 | FocalNet-T (LRF, Cascade Mask R-CNN) | 70.3 | No | Focal Modulation Networks | 2022-03-22 | Code
22 | FocalNet-T (SRF, Cascade Mask R-CNN) | 70.1 | No | Focal Modulation Networks | 2022-03-22 | Code
23 | ResNeSt-200-DCN (single-scale) | 69.53 | No | ResNeSt: Split-Attention Networks | 2020-04-19 | Code
24 | UniverseNet-20.08d (Res2Net-101, DCN, single-scale) | 69.5 | No | USB: Universal-Scale Object Detection Benchmark | 2021-03-25 | Code
25 | Sparse R-CNN (PVTv2-B2) | 69.5 | No | PVT v2: Improved Baselines with Pyramid Vision T... | 2021-06-25 | Code
26 | DINO-5scale (24 epoch) | 69.1 | No | DINO: DETR with Improved DeNoising Anchor Boxes ... | 2022-03-07 | Code
27 | DINO-5scale (36 epoch) | 69 | No | DINO: DETR with Improved DeNoising Anchor Boxes ... | 2022-03-07 | Code
28 | ResNeSt-200 (single-scale) | 68.78 | No | ResNeSt: Split-Attention Networks | 2020-04-19 | Code
29 | CenterMask+VoVNet99 (multi-scale) | 67.8 | No | CenterMask : Real-Time Anchor-Free Instance Segm... | 2019-11-15 | Code
30 | Mask R-CNN (ResNeXt-152 + 1 NL) | 67.8 | No | Non-local Neural Networks | 2017-11-21 | Code
31 | DN-Deformable-DETR-R50++ | 67.6 | No | DN-DETR: Accelerate DETR Training by Introducing... | 2022-03-02 | Code
32 | REGO-Deformable DETR-X101 | 67.5 | No | Recurrent Glimpse-based Decoder for Detection wi... | 2021-12-09 | Code
33 | Mask R-CNN (ResNeXt-152-FPN) | 67.1 | No | Rethinking ImageNet Pre-training | 2018-11-21 | Code
34 | UniverseNet-20.08 (Res2Net-50, DCN, single-scale) | 67 | No | USB: Universal-Scale Object Detection Benchmark | 2021-03-25 | Code
35 | DAB-DETR-DC5-R101 | 67 | No | DAB-DETR: Dynamic Anchor Boxes are Better Querie... | 2022-01-28 | Code
36 | GCNet (ResNeXt-101 + DCN + cascade + GC r16) | 66.9 | No | GCNet: Non-local Networks Meet Squeeze-Excitatio... | 2019-04-25 | Code
37 | Mask R-CNN (ResNeXt-152-FPN, cascade) | 66.8 | No | Rethinking ImageNet Pre-training | 2018-11-21 | Code
38 | Conditional DETR-DC5-R101 | 66.8 | No | Conditional DETR for Fast Training Convergence | 2021-08-13 | Code
39 | Res2Net101+HTC | 66.5 | No | Res2Net: A New Multi-scale Backbone Architecture | 2019-04-02 | Code
40 | Mask R-CNN-FPN (AOGNet-40M) | 66.2 | No | Attentive Normalization | 2019-08-04 | Code
41 | Anchor DETR-DC5-R101 | 65.7 | No | Anchor DETR: Query Design for Transformer-Based ... | 2021-09-15 | Code
42 | Conditional DETR-R101 | 65.6 | No | Conditional DETR for Fast Training Convergence | 2021-08-13 | Code
43 | MAE-Det (MAE-Det-L+GFLV2) | 65.5 | No | MAE-DET: Revisiting Maximum Entropy Principle in... | 2021-11-26 | Code
44 | RetinaNet (ViL-Base) | 65.5 | No | Multi-Scale Vision Longformer: A New Vision Tran... | 2021-03-29 | Code
45 | Conditional DETR-DC5-R50 | 65.4 | No | Conditional DETR for Fast Training Convergence | 2021-08-13 | Code
46 | DETR-DC5 (ResNet-101) | 64.7 | No | End-to-End Object Detection with Transformers | 2020-05-26 | Code
47 | Anchor DETR-DC5-R50 | 64.7 | No | Anchor DETR: Query Design for Transformer-Based ... | 2021-09-15 | Code
48 | DAB-DETR-R101 | 64.7 | No | DAB-DETR: Dynamic Anchor Boxes are Better Querie... | 2022-01-28 | Code
49 | HoughNet (HG-104, MS) | 64.6 | No | HoughNet: Integrating near and long-range eviden... | 2020-07-05 | Code
50 | Sparse R-CNN (ResNet-101, learnable proposals, random crop aug, FPN) | 64.6 | No | Sparse R-CNN: End-to-End Object Detection with L... | 2020-11-25 | Code
51 | Cascade Mask R-CNN (ResNet-50) | 64.3 | No | Deep Residual Learning for Image Recognition | 2015-12-10 | Code
52 | R3-CNN (ResNet-50-FPN, DCN) | 64.3 | No | Recursively Refined R-CNN: Instance Segmentation... | 2021-04-03 | Code
53 | Mask R-CNN-FPN (ResNeXt-101, GN+WS) | 64.15 | No | Micro-Batch Training with Batch-Channel Normaliz... | 2019-03-25 | Code
54 | R3-CNN (ResNet-50-FPN, GC-Net) | 64.1 | No | Recursively Refined R-CNN: Instance Segmentation... | 2021-04-03 | Code
55 | Conditional DETR-R50 | 64 | No | Conditional DETR for Fast Training Convergence | 2021-08-13 | Code
56 | Faster R-CNN (FPN, X-volution) | 64 | No | X-volution: On the unification of convolution an... | 2021-06-04 | -
57 | Faster RCNN-R101-FPN+ | 63.9 | No | End-to-End Object Detection with Transformers | 2020-05-26 | Code
58 | PVT-Large (RetinaNet 1x) | 63.7 | No | Pyramid Vision Transformer: A Versatile Backbone... | 2021-02-24 | Code
59 | PVT-Large (RetinaNet 3x, MS) | 63.6 | No | Pyramid Vision Transformer: A Versatile Backbone... | 2021-02-24 | Code
60 | Faster R-CNN (LIP-ResNet-101) | 63.6 | No | LIP: Local Importance-based Pooling | 2019-08-12 | Code
61 | TridentNet (ResNet-101) | 63.5 | No | Scale-Aware Trident Networks for Object Detection | 2019-01-07 | Code
62 | Sparse R-CNN (ResNet-50, learnable proposals, random crop aug, FPN) | 63.4 | No | Sparse R-CNN: End-to-End Object Detection with L... | 2020-11-25 | Code
63 | Pix2seq (R101-DC5) | 63.2 | No | Pix2seq: A Language Modeling Framework for Objec... | 2021-09-22 | Code
64 | PoolFormer-S36 (Mask R-CNN) | 63.1 | No | MetaFormer Is Actually What You Need for Vision | 2021-11-22 | Code
65 | Mask R-CNN (ResNet-101 + 1 NL) | 63.1 | No | Non-local Neural Networks | 2017-11-21 | Code
66 | GFL (ResNet-50) | 63 | No | Deep Residual Learning for Image Recognition | 2015-12-10 | Code
67 | Mask R-CNN (ResNet-101-FPN, GroupNorm, long) | 62.8 | No | Group Normalization | 2018-03-22 | Code
68 | Faster R-CNN (HRNetV2p-W48) | 62.8 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code
69 | Cascade R-CNN (HRNetV2p-W48) | 62.7 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code
70 | FSAF (ResNeXt-101, anchor-based branches) | 62.4 | No | Feature Selective Anchor-Free Module for Single-... | 2019-03-02 | Code
71 | GCNet (ResNet-50-FPN, GRoIE) | 62.4 | No | GCNet: Non-local Networks Meet Squeeze-Excitatio... | 2019-04-25 | Code
72 | HoughNet (HG-104) | 62.2 | No | HoughNet: Integrating near and long-range eviden... | 2020-07-05 | Code
73 | Sparse R-CNN (ResNet-101, FPN) | 62.1 | No | Sparse R-CNN: End-to-End Object Detection with L... | 2020-11-25 | Code
74 | ATSS (ResNet-50) | 61.9 | No | Deep Residual Learning for Image Recognition | 2015-12-10 | Code
75 | Faster R-CNN (HRNetV2p-W32) | 61.8 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code
76 | Cascade R-CNN (HRNetV2p-W32) | 61.7 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code
77 | Cascade R-CNN (ResNet-101-FPN+, cascade) | 61.6 | No | Cascade R-CNN: Delving into High Quality Object ... | 2017-12-03 | Code
78 | Mask R-CNN (ResNet-50-FPN, GroupNorm, long) | 61.6 | No | Group Normalization | 2018-03-22 | Code
79 | FPN+ | 61.3 | No | Feature Pyramid Networks for Object Detection | 2016-12-09 | Code
80 | Sparse R-CNN (ResNet-50, FPN) | 61.2 | No | Sparse R-CNN: End-to-End Object Detection with L... | 2020-11-25 | Code
81 | R3-CNN (ResNet-50-FPN, GRoIE) | 61.2 | No | Recursively Refined R-CNN: Instance Segmentation... | 2021-04-03 | Code
82 | Mask R-CNN (ResNet-50 + 1 NL) | 61.1 | No | Non-local Neural Networks | 2017-11-21 | Code
83 | Pix2seq (R50-DC5) | 61 | No | Pix2seq: A Language Modeling Framework for Objec... | 2021-09-22 | Code
84 | R3-CNN (ResNet-50-FPN) | 61 | No | Recursively Refined R-CNN: Instance Segmentation... | 2021-04-03 | Code
85 | Mask R-CNN (ResNet-50-FPN, GroupNorm) | 61 | No | Group Normalization | 2018-03-22 | Code
86 | Faster R-CNN+aLRP Loss (ResNet-50, 500 scale) | 60.7 | No | A Ranking-based, Balanced Loss Function Unifying... | 2020-09-28 | Code
87 | Grid R-CNN (ResNet-101-FPN) | 60.3 | No | Grid R-CNN | 2018-11-29 | Code
88 | RetinaNet+aLRP Loss (ResNet-50, 500 scale) | 60.3 | No | A Ranking-based, Balanced Loss Function Unifying... | 2020-09-28 | Code
89 | RetinaMask (ResNet-101-FPN) | 60.2 | No | RetinaMask: Learning to predict masks improves s... | 2019-01-10 | Code
90 | Mask R-CNN (ResNet-50-FPN, GRoIE) | 59.9 | No | A novel Region of Interest Extraction Layer for ... | 2020-04-28 | Code
91 | ExtremeNet (Hourglass-104, multi-scale) | 59.6 | No | Bottom-up Object Detection by Grouping Extreme a... | 2019-01-23 | Code
92 | PPDet (ResNet-101-FPN) | 59.5 | No | Reducing Label Noise in Anchor-Free Object Detec... | 2020-08-03 | Code
93 | Mask R-CNN (ResNeXt-101-FPN) | 59.5 | No | Mask R-CNN | 2017-03-20 | Code
94 | HTC (cascade) | 59.4 | No | Hybrid Task Cascade for Instance Segmentation | 2019-01-22 | Code
95 | Cascade R-CNN (ResNet-50-FPN+) | 59.4 | No | Cascade R-CNN: Delving into High Quality Object ... | 2017-12-03 | Code
96 | Libra R-CNN (ResNet-50 FPN) | 59.3 | No | Libra R-CNN: Towards Balanced Learning for Objec... | 2019-04-04 | Code
97 | Cascade R-CNN (HRNetV2p-W18) | 59.2 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code
98 | CenterNet511 (Hourglass-52) | 59.2 | No | CenterNet: Keypoint Triplets for Object Detection | 2019-04-17 | Code
99 | FSAF (ResNet-101, anchor-based branches) | 59.2 | No | Feature Selective Anchor-Free Module for Single-... | 2019-03-02 | Code
100 | Faster R-CNN (ResNet-50-FPN, GRoIE) | 59.2 | No | A novel Region of Interest Extraction Layer for ... | 2020-04-28 | Code
101 | Faster R-CNN (HRNetV2p-W18) | 58.9 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code
102 | FoveaBox+aLRP Loss (ResNet-50, 500 scale) | 58.8 | No | A Ranking-based, Balanced Loss Function Unifying... | 2020-09-28 | Code
103 | FoveaBox (ResNet-101-FPN, 800x800) | 58.4 | No | FoveaBox: Beyond Anchor-based Object Detector | 2019-04-08 | Code
104 | Grid R-CNN (ResNet-50-FPN) | 58.3 | No | Grid R-CNN | 2018-11-29 | Code
105 | FSAF (ResNet-101) | 58 | No | Feature Selective Anchor-Free Module for Single-... | 2019-03-02 | Code
106 | FoveaBox+Retina (ResNet-50) | 57.8 | No | FoveaBox: Beyond Anchor-based Object Detector | 2019-04-08 | Code
107 | FoveaBox (ResNet-101-FPN, 600x600) | 57.8 | No | FoveaBox: Beyond Anchor-based Object Detector | 2019-04-08 | Code
108 | FCOS (ResNet-50-FPN + improvements) | 57.4 | No | FCOS: Fully Convolutional One-Stage Object Detec... | 2019-04-02 | Code
109 | GHM-C + GHM-R (RetinaNet-FPN-ResNet-50, M=30) | 55.5 | No | Gradient Harmonized Single-stage Detector | 2018-11-13 | Code
110 | Online Fg Bal. Sampling+Hard Negative Mining (ResNet-50) | 55.3 | No | Generating Positive Bounding Boxes for Balanced ... | 2019-09-21 | Code
111 | FoveaBox (ResNet-50-FPN, 600x600) | 55.2 | No | FoveaBox: Beyond Anchor-based Object Detector | 2019-04-08 | Code
112 | ExtremeNet (Hourglass-104, single-scale) | 55.1 | No | Bottom-up Object Detection by Grouping Extreme a... | 2019-01-23 | Code
113 | FSAF (ResNet-50) | 55 | No | Feature Selective Anchor-Free Module for Single-... | 2019-03-02 | Code
114 | CornerNet511 (Hourglass-104) | 53.8 | No | CornerNet: Detecting Objects as Paired Keypoints | 2018-08-03 | Code
115 | M2Det (ResNet-101, 320x320) | 53.7 | No | M2Det: A Single-Shot Object Detector based on Mu... | 2018-11-12 | Code
116 | Faster R-CNN (Res2Net-50) | 53.6 | No | Res2Net: A New Multi-scale Backbone Architecture | 2019-04-02 | Code
117 | M2Det (VGG-16, 320x320) | 52.2 | No | M2Det: A Single-Shot Object Detector based on Mu... | 2018-11-12 | Code