| 1 | Co-DETR | 66 | No | DETRs with Collaborative Hybrid Assignments Trai... | 2022-11-22 | Code |
| 2 | InternImage-H (M3I Pre-training) | 65.5 | No | InternImage: Exploring Large-Scale Vision Founda... | 2022-11-10 | Code |
| 3 | M3I Pre-training (InternImage-H) | 65.4 | No | Towards All-in-one Pre-training via Maximizing M... | 2022-11-17 | Code |
| 4 | MoCaE | 65.1 | No | MoCaE: Mixture of Calibrated Experts Significant... | 2023-09-26 | Code |
| 5 | Focal-Stable-DINO (Focal-Huge, no TTA) | 64.8 | No | A Strong and Reproducible Object Detector with O... | 2023-04-25 | Code |
| 6 | Co-DETR (Swin-L) | 64.8 | No | DETRs with Collaborative Hybrid Assignments Trai... | 2022-11-22 | Code |
| 7 | EVA | 64.7 | No | EVA: Exploring the Limits of Masked Visual Repre... | 2022-11-14 | Code |
| 8 | Group DETR v2 | 64.5 | No | Group DETR v2: Strong Object Detector with Encod... | 2022-11-07 | - |
| 9 | FocalNet-H (DINO) | 64.4 | No | Focal Modulation Networks | 2022-03-22 | Code |
| 10 | InternImage-XL | 64.3 | No | InternImage: Exploring Large-Scale Vision Founda... | 2022-11-10 | Code |
| 11 | FD-SwinV2-G | 64.2 | No | Contrastive Learning Rivals Masked Image Modelin... | 2022-05-27 | Code |
| 12 | Plain-DETR (Swin-L) | 63.9 | No | - | - | Code |
| 13 | RevCol-H (DINO) | 63.8 | No | Reversible Column Networks | 2022-12-22 | Code |
| 14 | BEiT-3 | 63.7 | No | Image as a Foreign Language: BEiT Pretraining fo... | 2022-08-22 | Code |
| 15 | Relation-DETR (Focal-L) | 63.5 | No | Relation DETR: Exploring Explicit Position Relat... | 2024-07-16 | Code |
| 16 | DETA (Swin-L) | 63.5 | No | NMS Strikes Back | 2022-12-12 | Code |
| 17 | DINO (Swin-L, multi-scale, TTA) | 63.3 | No | DINO: DETR with Improved DeNoising Anchor Boxes ... | 2022-03-07 | Code |
| 18 | SwinV2-G (HTC++) | 63.1 | No | Swin Transformer V2: Scaling Up Capacity and Res... | 2021-11-18 | Code |
| 19 | Grounding DINO | 63 | No | Grounding DINO: Marrying DINO with Grounded Pre-... | 2023-03-09 | Code |
| 20 | Florence-CoSwin-H | 62.4 | No | Florence: A New Foundation Model for Computer Vi... | 2021-11-22 | Code |
| 21 | GLIPv2 (CoSwin-H, multi-scale) | 62.4 | No | GLIPv2: Unifying Localization and Vision-Languag... | 2022-06-12 | Code |
| 22 | GLEE-Pro | 62.3 | No | General Object Foundation Model for Images and V... | 2023-12-14 | Code |
| 23 | GLIP (Swin-L, multi-scale) | 61.5 | No | Grounded Language-Image Pre-training | 2021-12-07 | Code |
| 24 | Soft Teacher + Swin-L (HTC++, multi-scale) | 61.3 | No | End-to-End Semi-Supervised Object Detection with... | 2021-06-16 | Code |
| 25 | ViT-Adapter-L (HTC++, BEiTv2 pretrain, multi-scale) | 60.9 | No | Vision Transformer Adapter for Dense Predictions | 2022-05-17 | Code |
| 26 | DyHead (Swin-L, multi scale, self-training) | 60.6 | No | Dynamic Head: Unifying Object Detection Heads wi... | 2021-06-15 | Code |
| 27 | GLEE-Plus | 60.6 | No | General Object Foundation Model for Images and V... | 2023-12-14 | Code |
| 28 | ViT-Adapter-L (HTC++, BEiT pretrain, multi-scale) | 60.4 | No | Vision Transformer Adapter for Dense Predictions | 2022-05-17 | Code |
| 29 | GRiT (ViT-H, single-scale testing) | 60.4 | No | GRiT: A Generative Region-to-text Transformer fo... | 2022-12-01 | Code |
| 30 | CBNetV2 (Dual-Swin-L HTC, multi-scale) | 60.1 | No | CBNet: A Composite Backbone Network Architecture... | 2021-07-01 | Code |
| 31 | PIIP-H6B (DINO) | 60 | No | Parameter-Inverted Image Pyramid Networks | 2024-06-06 | Code |
| 32 | CBNetV2 (Dual-Swin-L HTC, single-scale) | 59.4 | No | CBNet: A Composite Backbone Network Architecture... | 2021-07-01 | Code |
| 33 | Focal-L (DyHead, multi-scale) | 58.9 | No | Focal Self-attention for Local-Global Interactio... | 2021-07-01 | Code |
| 34 | DyHead (Swin-L, multi scale) | 58.7 | No | Dynamic Head: Unifying Object Detection Heads wi... | 2021-06-15 | Code |
| 35 | Swin-L (HTC++, multi scale) | 58.7 | No | Swin Transformer: Hierarchical Vision Transforme... | 2021-03-25 | Code |
| 36 | Swin-L (HTC++, single scale) | 57.7 | No | Swin Transformer: Hierarchical Vision Transforme... | 2021-03-25 | Code |
| 37 | Cascade Eff-B7 NAS-FPN (1280, self-training Copy Paste, single-scale) | 57.3 | No | Simple Copy-Paste is a Strong Data Augmentation ... | 2020-12-13 | Code |
| 38 | PyCenterNet (Swin-L, multi-scale) | 57.1 | No | CenterNet++ for Object Detection | 2022-04-18 | Code |
| 39 | dBOT ViT-L (CLIP) | 56.8 | No | Exploring Target Representations for Masked Auto... | 2022-09-08 | Code |
| 40 | YOLOv7-D6 (44 fps) | 56.6 | Yes | YOLOv7: Trainable bag-of-freebies sets new state... | 2022-07-06 | Code |
| 41 | SOLQ (Swin-L, single scale) | 56.5 | No | SOLQ: Segmenting Objects by Learning Queries | 2021-06-04 | Code |
| 42 | CenterNet2 (Res2Net-101-DCN-BiFPN, self-training, 1560 single-scale) | 56.4 | No | Probabilistic two-stage detection | 2021-03-12 | Code |
| 43 | ISTR (ResNet50-FPN-3x, single-scale) | 56.4 | No | ISTR: End-to-End Instance Segmentation with Tran... | 2021-05-03 | Code |
| 44 | QueryInst (single-scale) | 56.1 | No | Instances as Queries | 2021-05-05 | Code |
| 45 | dBOT ViT-L | 56.1 | No | Exploring Target Representations for Masked Auto... | 2022-09-08 | Code |
| 46 | YOLOv7-E6 (56 fps) | 56 | No | YOLOv7: Trainable bag-of-freebies sets new state... | 2022-07-06 | Code |
| 47 | YOLOv4-P7 with TTA | 55.8 | No | Scaled-YOLOv4: Scaling Cross Stage Partial Network | 2020-11-16 | Code |
| 48 | DetectoRS (ResNeXt-101-64x4d, multi-scale) | 55.7 | No | DetectoRS: Detecting Objects with Recursive Feat... | 2020-06-03 | Code |
| 49 | YOLOR-D6 (1280, single-scale, 30 fps) | 55.4 | No | You Only Learn One Representation: Unified Netwo... | 2021-05-10 | Code |
| 50 | YOLOv4-P6 with TTA | 54.9 | No | Scaled-YOLOv4: Scaling Cross Stage Partial Network | 2020-11-16 | Code |
| 51 | YOLOv7-W6 (84 fps) | 54.9 | No | YOLOv7: Trainable bag-of-freebies sets new state... | 2022-07-06 | Code |
| 52 | Cascade Eff-B7 NAS-FPN (1280) | 54.8 | No | Simple Copy-Paste is a Strong Data Augmentation ... | 2020-12-13 | Code |
| 53 | DetectoRS (ResNeXt-101-32x4d, multi-scale) | 54.7 | No | DetectoRS: Detecting Objects with Recursive Feat... | 2020-06-03 | Code |
| 54 | GLEE-Lite | 54.7 | No | General Object Foundation Model for Images and V... | 2023-12-14 | Code |
| 55 | YOLOv4-P6 CSP-P6 (single-scale, 32 fps) | 54.3 | No | Scaled-YOLOv4: Scaling Cross Stage Partial Network | 2020-11-16 | Code |
| 56 | SpineNet-190 (1280, with Self-training on OpenImages, single-scale) | 54.3 | No | Rethinking Pre-training and Self-training | 2020-06-11 | Code |
| 57 | UniverseNet-20.08d (Res2Net-101, DCN, multi-scale) | 54.1 | No | USB: Universal-Scale Object Detection Benchmark | 2021-03-25 | Code |
| 58 | DyHead (ResNeXt-64x4d-101-DCN, multi scale) | 54 | No | Dynamic Head: Unifying Object Detection Heads wi... | 2021-06-15 | Code |
| 59 | dBOT ViT-B (CLIP) | 53.6 | No | Exploring Target Representations for Masked Auto... | 2022-09-08 | Code |
| 60 | PAA (ResNeXt-152-32x8d + DCN, multi-scale) | 53.5 | No | Probabilistic Anchor Assignment with IoU Predict... | 2020-07-16 | Code |
| 61 | LSNet (Res2Net-101 + DCN, multi-scale) | 53.5 | No | Location-Sensitive Visual Recognition with Cross... | 2021-04-11 | Code |
| 62 | dBOT ViT-B | 53.5 | No | Exploring Target Representations for Masked Auto... | 2022-09-08 | Code |
| 63 | ResNeSt-200 (multi-scale) | 53.3 | No | ResNeSt: Split-Attention Networks | 2020-04-19 | Code |
| 64 | Cascade Mask R-CNN (Triple-ResNeXt152, multi-scale) | 53.3 | No | CBNet: A Novel Composite Backbone Network Archit... | 2019-09-09 | Code |
| 65 | DetectoRS (ResNeXt-101-32x4d, single-scale) | 53.3 | No | DetectoRS: Detecting Objects with Recursive Feat... | 2020-06-03 | Code |
| 66 | GFLV2 (Res2Net-101, DCN, multiscale) | 53.3 | No | Generalized Focal Loss V2: Learning Reliable Loc... | 2020-11-25 | Code |
| 67 | YOLOv7-X (114 fps) | 53.1 | Yes | YOLOv7: Trainable bag-of-freebies sets new state... | 2022-07-06 | Code |
| 68 | RelationNet++ (ResNeXt-64x4d-101-DCN) | 52.7 | No | RelationNet++: Bridging Visual Representations f... | 2020-10-29 | Code |
| 69 | EfficientDet-D7 (1536) | 52.6 | Yes | EfficientDet: Scalable and Efficient Object Dete... | 2019-11-20 | Code |
| 70 | YOLOv4-P5 with TTA | 52.5 | No | Scaled-YOLOv4: Scaling Cross Stage Partial Network | 2020-11-16 | Code |
| 71 | Deformable DETR (ResNeXt-101+DCN) | 52.3 | No | Deformable DETR: Deformable Transformers for End... | 2020-10-08 | Code |
| 72 | GCNet (ResNeXt-101 + DCN + cascade + GC r4) | 52.3 | No | Global Context Networks | 2020-12-24 | Code |
| 73 | PP-YOLOE-x (CSPRepResNet-x, 640x640, single-scale) | 52.2 | No | PP-YOLOE: An evolved version of YOLO | 2022-03-30 | Code |
| 74 | RetinaNet (SpineNet-190, 1280x1280) | 52.1 | No | SpineNet: Learning Scale-Permuted Backbone for R... | 2019-12-10 | Code |
| 75 | RepPoints v2 (ResNeXt-101, DCN, multi-scale) | 52.1 | No | RepPoints V2: Verification Meets Regression for ... | 2020-07-16 | Code |
| 76 | AC-FPN Cascade R-CNN (X-152-32x8d-FPN-IN5k, multi scale, only CEM) | 51.9 | No | Attention-guided Context Feature Pyramid Network... | 2020-05-23 | Code |
| 77 | OTA (ResNeXt-101+DCN, multiscale) | 51.5 | No | OTA: Optimal Transport Assignment for Object Det... | 2021-03-26 | Code |
| 78 | YOLOX-x (Modified CSP v5, 640x640, single-scale) | 51.5 | Yes | YOLOX: Exceeding YOLO Series in 2021 | 2021-07-18 | Code |
| 79 | PP-YOLOE-l (CSPRepResNet-l, 640x640, single-scale) | 51.4 | No | PP-YOLOE: An evolved version of YOLO | 2022-03-30 | Code |
| 80 | YOLOv7 (161 fps) | 51.4 | Yes | YOLOv7: Trainable bag-of-freebies sets new state... | 2022-07-06 | Code |
| 81 | UniverseNet-20.08d (Res2Net-101, DCN, single-scale) | 51.3 | No | USB: Universal-Scale Object Detection Benchmark | 2021-03-25 | Code |
| 82 | TSD (SENet154-DCN, multi-scale) | 51.2 | No | Revisiting the Sibling Head in Object Detector | 2020-03-17 | Code |
| 83 | YOLOX-X (Modified CSP v5) | 51.2 | No | YOLOX: Exceeding YOLO Series in 2021 | 2021-07-18 | Code |
| 84 | iBOT (ViT-B/16) | 51.2 | No | iBOT: Image BERT Pre-Training with Online Tokeni... | 2021-11-15 | Code |
| 85 | RetinaNet (SpineNet-143, 1280x1280) | 50.7 | No | SpineNet: Learning Scale-Permuted Backbone for R... | 2019-12-10 | Code |
| 86 | ATSS (ResNeXt-64x4d-101+DCN, multi-scale) | 50.7 | No | Bridging the Gap Between Anchor-based and Anchor... | 2019-12-05 | Code |
| 87 | NAS-FPN (AmoebaNet-D, learned aug) | 50.7 | No | Learning Data Augmentation Strategies for Object... | 2019-06-26 | Code |
| 88 | Boosting R-CNN* | 50.7 | No | Boosting R-CNN: Reweighting R-CNN Samples by RPN... | 2022-06-28 | Code |
| 89 | GFLV2 (Res2Net-101, DCN) | 50.6 | No | Generalized Focal Loss V2: Learning Reliable Loc... | 2020-11-25 | Code |
| 90 | aLRP Loss (ResNeXt-101-64x4d, DCN, multiscale test) | 50.2 | No | A Ranking-based, Balanced Loss Function Unifying... | 2020-09-28 | Code |
| 91 | FreeAnchor + SEPC (DCN, ResNeXt-101-64x4d) | 50.1 | No | Scale-Equalizing Pyramid Convolution for Object ... | 2020-05-06 | Code |
| 92 | D2Det (ResNet-101-DCN, multi-scale test) | 50.1 | No | - | - | Code |
| 93 | Dynamic R-CNN (ResNet-101-DCN, multi-scale) | 50.1 | No | Dynamic R-CNN: Towards High Quality Object Detec... | 2020-04-13 | Code |
| 94 | TSD (ResNet-101-Deformable, Image Pyramid) | 49.4 | No | Revisiting the Sibling Head in Object Detector | 2020-03-17 | Code |
| 95 | RepPoints v2 (ResNeXt-101, DCN) | 49.4 | No | RepPoints V2: Verification Meets Regression for ... | 2020-07-16 | Code |
| 96 | A2MIM (ViT-B) | 49.4 | No | Architecture-Agnostic Masked Image Modeling -- F... | 2022-05-27 | Code |
| 97 | iBOT (ViT-S/16) | 49.4 | No | iBOT: Image BERT Pre-Training with Online Tokeni... | 2021-11-15 | Code |
| 98 | CPNDet (Hourglass-104, multi-scale) | 49.2 | No | Corner Proposal Network for Anchor-free, Two-sta... | 2020-07-27 | Code |
| 99 | GFLV2 (ResNeXt-101, 32x4d, DCN) | 49 | No | Generalized Focal Loss V2: Learning Reliable Loc... | 2020-11-25 | Code |
| 100 | aLRP Loss (ResNeXt-101-64x4d, DCN, single scale) | 48.9 | No | A Ranking-based, Balanced Loss Function Unifying... | 2020-09-28 | Code |
| 101 | PP-YOLOE-m (CSPRepResNet-m, 640x640, single-scale) | 48.9 | No | PP-YOLOE: An evolved version of YOLO | 2022-03-30 | Code |
| 102 | UniverseNet-20.08 (Res2Net-50, DCN, single-scale) | 48.8 | No | USB: Universal-Scale Object Detection Benchmark | 2021-03-25 | Code |
| 103 | SOLQ (ResNet101, single scale) | 48.7 | No | SOLQ: Segmenting Objects by Learning Queries | 2021-06-04 | Code |
| 104 | RetinaNet (SpineNet-96, 1024x1024) | 48.6 | No | SpineNet: Learning Scale-Permuted Backbone for R... | 2019-12-10 | Code |
| 105 | TridentNet (ResNet-101-Deformable, Image Pyramid) | 48.4 | No | Scale-Aware Trident Networks for Object Detection | 2019-01-07 | Code |
| 106 | GCNet (ResNeXt-101 + DCN + cascade + GC r4) | 48.4 | No | GCNet: Non-local Networks Meet Squeeze-Excitatio... | 2019-04-25 | Code |
| 107 | GFLV2 (ResNet-101-DCN) | 48.3 | No | Generalized Focal Loss V2: Learning Reliable Loc... | 2020-11-25 | Code |
| 108 | Swin-S (RPE w/ GAB) | 48.23 | No | Understanding Gaussian Attention Bias of Vision ... | 2023-05-08 | Code |
| 109 | GFL (X-101-32x4d-DCN, single-scale) | 48.2 | No | Generalized Focal Loss: Learning Qualified and D... | 2020-06-08 | Code |
| 110 | ISTR (ResNet101-FPN-3x, single-scale) | 48.1 | No | ISTR: End-to-End Instance Segmentation with Tran... | 2021-05-03 | Code |
| 111 | YOLOX-Darknet53 (Darknet53, 640x640, single-scale) | 48 | Yes | YOLOX: Exceeding YOLO Series in 2021 | 2021-07-18 | Code |
| 112 | DAT-S (RetinaNet) | 47.9 | No | Vision Transformer with Deformable Attention | 2022-01-03 | Code |
| 113 | aLRP Loss (ResNeXt-101-64x4d, single scale) | 47.8 | No | A Ranking-based, Balanced Loss Function Unifying... | 2020-09-28 | Code |
| 114 | MatrixNet Corners (ResNet-152, multi-scale) | 47.8 | No | Matrix Nets: A New Deep Architecture for Object ... | 2019-08-13 | Code |
| 115 | SOLQ (ResNet50, single scale) | 47.8 | No | SOLQ: Segmenting Objects by Learning Queries | 2021-06-04 | Code |
| 116 | DyHead (ResNeXt-64x4d-101) | 47.7 | No | Dynamic Head: Unifying Object Detection Heads wi... | 2021-06-15 | Code |
| 117 | SAPD (ResNeXt-101, single-scale) | 47.4 | No | Soft Anchor-Point Object Detection | 2019-11-27 | Code |
| 118 | PANet (ResNeXt-101, multi-scale) | 47.4 | No | Path Aggregation Network for Instance Segmentation | 2018-03-05 | Code |
| 119 | HTC (HRNetV2p-W48) | 47.3 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code |
| 120 | HTC (ResNeXt-101-FPN) | 47.1 | No | Hybrid Task Cascade for Instance Segmentation | 2019-01-22 | Code |
| 121 | CenterNet511 (Hourglass-104, multi-scale) | 47 | No | CenterNet: Keypoint Triplets for Object Detection | 2019-04-17 | Code |
| 122 | MAL (ResNeXt101, multi-scale) | 47 | No | Multiple Anchor Learning for Visual Object Detec... | 2019-12-04 | Code |
| 123 | ISTR (ResNet50-FPN-3x) | 46.8 | No | ISTR: End-to-End Instance Segmentation with Tran... | 2021-05-03 | Code |
| 124 | RetinaNet (SpineNet-49, 896x896) | 46.7 | No | SpineNet: Learning Scale-Permuted Backbone for R... | 2019-12-10 | Code |
| 125 | RPDet (ResNet-101-DCN, multi-scale) | 46.5 | No | RepPoints: Point Set Representation for Object D... | 2019-04-25 | Code |
| 126 | HoughNet (MS) | 46.4 | No | HoughNet: Integrating near and long-range eviden... | 2020-07-05 | Code |
| 127 | PPDet (ResNeXt-101-FPN, multiscale) | 46.3 | No | Reducing Label Noise in Anchor-Free Object Detec... | 2020-08-03 | Code |
| 128 | GFLV2 (ResNet-101) | 46.2 | No | Generalized Focal Loss V2: Learning Reliable Loc... | 2020-11-25 | Code |
| 129 | SNIPER (ResNet-101) | 46.1 | No | SNIPER: Efficient Multi-Scale Training | 2018-05-23 | Code |
| 130 | Mask R-CNN (HRNetV2p-W48 + cascade) | 46.1 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code |
| 131 | ResNeXt-64x4d-101 NAS-FCOS @128-256 w/improvements | 46.1 | No | NAS-FCOS: Fast Neural Architecture Search for Ob... | 2019-06-11 | Code |
| 132 | DCNv2 (ResNet-101, multi-scale) | 46 | No | Deformable ConvNets v2: More Deformable, Better ... | 2018-11-27 | Code |
| 133 | Gaussian-FCOS | 46 | No | Localization Uncertainty Estimation for Anchor-F... | 2020-06-28 | - |
| 134 | Cascade R-CNN-FPN (ResNet-101, map-guided) | 45.9 | No | InstaBoost: Boosting Instance Segmentation via P... | 2019-08-21 | Code |
| 135 | MAL (ResNeXt101, single-scale) | 45.9 | No | Multiple Anchor Learning for Visual Object Detec... | 2019-12-04 | Code |
| 136 | CenterMask+VoVNetV2-99 (single-scale) | 45.8 | No | CenterMask : Real-Time Anchor-Free Instance Segm... | 2019-11-15 | Code |
| 137 | D-RFCN + SNIP (DPN-98 with flip, multi-scale) | 45.7 | No | An Analysis of Scale Invariance in Object Detect... | 2017-11-22 | - |
| 138 | YOLOv4 (CD53) | 45.5 | Yes | Scaled-YOLOv4: Scaling Cross Stage Partial Network | 2020-11-16 | Code |
| 139 | AC-FPN Cascade R-CNN (ResNet-101, single scale) | 45 | No | Attention-guided Context Feature Pyramid Network... | 2020-05-23 | Code |
| 140 | FreeAnchor (ResNeXt-101) | 44.8 | No | FreeAnchor: Learning to Match Anchors for Visual... | 2019-09-05 | Code |
| 141 | FCOS (ResNeXt-64x4d-101-FPN 4 + improvements) | 44.7 | No | FCOS: Fully Convolutional One-Stage Object Detec... | 2019-04-02 | Code |
| 142 | CenterMask+VoVNet2-57 (single-scale) | 44.7 | No | CenterMask : Real-Time Anchor-Free Instance Segm... | 2019-11-15 | Code |
| 143 | FSAF (ResNeXt-101, multi-scale) | 44.6 | No | Feature Selective Anchor-Free Module for Single-... | 2019-03-02 | Code |
| 144 | aLRP Loss (ResNeXt-101, DCN, 500 scale) | 44.6 | No | A Ranking-based, Balanced Loss Function Unifying... | 2020-09-28 | Code |
| 145 | CenterMask + X-101-32x8d (single-scale) | 44.6 | No | CenterMask : Real-Time Anchor-Free Instance Segm... | 2019-11-15 | Code |
| 146 | RetinaNet (SpineNet-49, 640x640) | 44.3 | No | SpineNet: Learning Scale-Permuted Backbone for R... | 2019-12-10 | Code |
| 147 | YOLOF-DC5 | 44.3 | No | You Only Look One-level Feature | 2021-03-17 | Code |
| 148 | GFLV2 (ResNet-50) | 44.3 | No | Generalized Focal Loss V2: Learning Reliable Loc... | 2020-11-25 | Code |
| 149 | InterNet (ResNet-101-FPN, multi-scale) | 44.2 | No | Feature Intertwiner for Object Detection | 2019-03-28 | Code |
| 150 | M2Det (VGG-16, multi-scale) | 44.2 | No | M2Det: A Single-Shot Object Detector based on Mu... | 2018-11-12 | Code |
| 151 | Faster R-CNN (LIP-ResNet-101-MD w FPN) | 43.9 | No | LIP: Local Importance-based Pooling | 2019-08-12 | Code |
| 152 | M2Det (ResNet-101, multi-scale) | 43.9 | No | M2Det: A Single-Shot Object Detector based on Mu... | 2018-11-12 | Code |
| 153 | YOLOv3 @800 + ASFF* (Darknet-53) | 43.9 | Yes | Learning Spatial Fusion for Single-Shot Object D... | 2019-11-21 | Code |
| 154 | FoveaBox (ResNeXt-101) | 43.9 | No | FoveaBox: Beyond Anchor-based Object Detector | 2019-04-08 | Code |
| 155 | ExtremeNet (Hourglass-104, multi-scale) | 43.7 | No | Bottom-up Object Detection by Grouping Extreme a... | 2019-01-23 | Code |
| 156 | YOLOv4-608 | 43.5 | Yes | YOLOv4: Optimal Speed and Accuracy of Object Det... | 2020-04-23 | Code |
| 157 | SNIPER (ResNet-50) | 43.5 | No | SNIPER: Efficient Multi-Scale Training | 2018-05-23 | Code |
| 158 | CenterNet (HRNetV2-W48) | 43.5 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code |
| 159 | D-RFCN + SNIP (ResNet-101, multi-scale) | 43.4 | No | An Analysis of Scale Invariance in Object Detect... | 2017-11-22 | - |
| 160 | Grid R-CNN (ResNeXt-101-FPN) | 43.2 | No | Grid R-CNN | 2018-11-29 | Code |
| 161 | FCOS (ResNeXt-101-64x4d-FPN) | 43.2 | No | FCOS: Fully Convolutional One-Stage Object Detec... | 2019-04-02 | Code |
| 162 | CornerNet-Saccade (Hourglass-104, multi-scale) | 43.2 | No | CornerNet-Lite: Efficient Keypoint Based Object ... | 2019-04-18 | Code |
| 163 | PP-YOLOE-s (CSPRepResNet-s, 640x640, single-scale) | 43.1 | No | PP-YOLOE: An evolved version of YOLO | 2022-03-30 | Code |
| 164 | Libra R-CNN (ResNeXt-101-FPN) | 43 | No | Libra R-CNN: Towards Balanced Learning for Objec... | 2019-04-04 | Code |
| 165 | DyHead (ResNet-50) | 43 | No | Dynamic Head: Unifying Object Detection Heads wi... | 2021-06-15 | Code |
| 166 | RPDet (ResNet-101-DCN) | 42.8 | No | RepPoints: Point Set Representation for Object D... | 2019-04-25 | Code |
| 167 | SpineNet-49 (640, RetinaNet, single-scale) | 42.8 | No | SpineNet: Learning Scale-Permuted Backbone for R... | 2019-12-10 | Code |
| 168 | Cascade R-CNN (ResNet-101-FPN+, cascade) | 42.8 | No | Cascade R-CNN: Delving into High Quality Object ... | 2017-12-03 | Code |
| 169 | Cascade R-CNN | 42.8 | No | Cascade R-CNN: High Quality Object Detection and... | 2019-06-24 | Code |
| 170 | TridentNet (ResNet-101) | 42.7 | No | Scale-Aware Trident Networks for Object Detection | 2019-01-07 | Code |
| 171 | FCOS (ResNeXt-32x8d-101-FPN) | 42.7 | No | FCOS: Fully Convolutional One-Stage Object Detec... | 2019-04-02 | Code |
| 172 | RetinaMask (ResNeXt-101-FPN-GN) | 42.6 | No | RetinaMask: Learning to predict masks improves s... | 2019-01-10 | Code |
| 173 | TAL + TAP | 42.5 | No | TOOD: Task-aligned One-stage Object Detection | 2021-08-17 | Code |
| 174 | Faster R-CNN (HRNetV2p-W48) | 42.4 | No | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code |
| 175 | HSD (Rest101, 768x768, single-scale test) | 42.3 | No | - | - | Code |
| 176 | CornerNet511 (Hourglass-104, multi-scale) | 42.1 | No | CornerNet: Detecting Objects as Paired Keypoints | 2018-08-03 | Code |
| 177 | FoveaBox (ResNeXt-101) | 42.1 | No | FoveaBox: Beyond Anchor-based Object Detector | 2019-04-08 | Code |
| 178 | FCOS (HRNet-W32-5l) | 42 | No | FCOS: Fully Convolutional One-Stage Object Detec... | 2019-04-02 | Code |
| 179 | FoveaBox (ResNeXt-101) | 41.9 | No | FoveaBox: Beyond Anchor-based Object Detector | 2019-04-08 | Code |
| 180 | RefineDet512+ (ResNet-101) | 41.8 | No | Single-Shot Refinement Neural Network for Object... | 2017-11-18 | Code |
| 181 | GHM-C + GHM-R (RetinaNet-FPN-ResNeXt-101) | 41.6 | No | Gradient Harmonized Single-stage Detector | 2018-11-13 | Code |
| 182 | CenterNet-DLA (DLA-34, multi-scale) | 41.6 | No | Objects as Points | 2019-04-16 | Code |
| 183 | RetinaNet (SpineNet-49S, 640x640) | 41.5 | No | SpineNet: Learning Scale-Permuted Backbone for R... | 2019-12-10 | Code |
| 184 | RPDet (ResNet-101) | 41 | No | RepPoints: Point Set Representation for Object D... | 2019-04-25 | Code |
| 185 | M2Det (VGG-16, single-scale) | 41 | No | M2Det: A Single-Shot Object Detector based on Mu... | 2018-11-12 | Code |
| 186 | LeYOLO (Large@768) | 41 | No | LeYOLO, New Scalable and Efficient CNN Architect... | 2024-06-20 | Code |
| 187 | FSAF (ResNet-101, single-scale) | 40.9 | No | Feature Selective Anchor-Free Module for Single-... | 2019-03-02 | Code |
| 188 | RetinaNet (ResNeXt-101-FPN) | 40.8 | No | Focal Loss for Dense Object Detection | 2017-08-07 | Code |
| 189 | Cascade R-CNN (ResNet-50-FPN+, cascade) | 40.6 | No | Cascade R-CNN: Delving into High Quality Object ... | 2017-12-03 | Code |
| 190 | Faster R-CNN (Cascade RPN) | 40.6 | Yes | Cascade RPN: Delving into High-Quality Region Pr... | 2019-09-15 | Code |
| 191 | ResNet-50-DW-DPN (Deformable Kernels) | 40.6 | No | Deformable Kernels: Adapting Effective Receptive... | 2019-10-07 | Code |
| 192 | IoU-Net | 40.6 | No | Acquisition of Localization Confidence for Accur... | 2018-07-30 | Code |
| 193 | FCOS (HRNetV2p-W48) | 40.5 | Yes | Deep High-Resolution Representation Learning for... | 2019-08-20 | Code |
| 194 | ResNet-50-FPN Mask R-CNN + KL Loss + var voting + soft-NMS | 40.4 | No | Bounding Box Regression with Uncertainty for Acc... | 2018-09-23 | Code |
| 195 | RDSNet (ResNet-101, RetinaNet, mask, MBRM) | 40.3 | No | RDSNet: A New Deep Architecture for Reciprocal O... | 2019-12-11 | Code |
| 196 | ExtremeNet (Hourglass-104, single-scale) | 40.2 | No | Bottom-up Object Detection by Grouping Extreme a... | 2019-01-23 | Code |
| 197 | Mask R-CNN (ResNet-101-FPN, CBN) | 40.1 | No | Cross-Iteration Batch Normalization | 2020-02-13 | Code |
| 198 | Fast R-CNN (Cascade RPN) | 40.1 | Yes | Cascade RPN: Delving into High-Quality Region Pr... | 2019-09-15 | Code |
| 199 | Mask R-CNN (ResNeXt-101-FPN) | 39.8 | No | Mask R-CNN | 2017-03-20 | Code |
| 200 | GA-Faster-RCNN | 39.8 | No | Region Proposal by Guided Anchoring | 2019-01-10 | Code |
| 201 | ResNet-50 NAS-FCOS @256 | 39.8 | No | NAS-FCOS: Fast Neural Architecture Search for Ob... | 2019-06-11 | Code |
| 202 | A2MIM (ResNet-50 2x) | 39.8 | No | Architecture-Agnostic Masked Image Modeling -- F... | 2022-05-27 | Code |
| 203 | FPN (ResNet101 backbone) | 39.5 | No | ChainerCV: a Library for Deep Learning in Comput... | 2017-08-28 | Code |
| 204 | RetinaMask (ResNet-50-FPN) | 39.4 | No | RetinaMask: Learning to predict masks improves s... | 2019-01-10 | Code |
| 205 | LeYOLO (Medium@640) | 39.3 | No | LeYOLO, New Scalable and Efficient CNN Architect... | 2024-06-20 | Code |
| 206 | AA-ResNet-10 + RetinaNet | 39.2 | No | Attention Augmented Convolutional Networks | 2019-04-22 | Code |
| 207 | MAL (ResNet50, single-scale) | 39.2 | No | Multiple Anchor Learning for Visual Object Detec... | 2019-12-04 | Code |
| 208 | RetinaNet (ResNet-101-FPN) | 39.1 | No | Focal Loss for Dense Object Detection | 2017-08-07 | Code |
| 209 | Cascade R-CNN (ResNet-101-FPN+) | 38.8 | No | Cascade R-CNN: Delving into High Quality Object ... | 2017-12-03 | Code |
| 210 | M2Det (ResNet-101, single-scale) | 38.8 | No | M2Det: A Single-Shot Object Detector based on Mu... | 2018-11-12 | Code |
| 211 | SaccadeNet (DLA-34-DCN) | 38.5 | No | SaccadeNet: A Fast and Accurate Object Detector | 2020-03-26 | Code |
| 212 | Mask R-CNN (ResNet-101-FPN) | 38.2 | No | Mask R-CNN | 2017-03-20 | Code |
| 213 | LeYOLO (Small@640) | 38.2 | No | LeYOLO, New Scalable and Efficient CNN Architect... | 2024-06-20 | Code |
| 214 | WSMA-Seg | 38.1 | No | Segmentation is All You Need | 2019-04-30 | - |
| 215 | Faster R-CNN + FPN + CGD | 37.9 | No | Compact Global Descriptor for Neural Networks | 2019-07-23 | Code |
| 216 | CornerNet511 (Hourglass-52, single-scale) | 37.8 | No | CornerNet: Detecting Objects as Paired Keypoints | 2018-08-03 | Code |
| 217 | RefineDet512+ (VGG-16) | 37.6 | No | Single-Shot Refinement Neural Network for Object... | 2017-11-18 | Code |
| 218 | DeformConv-R-FCN (Aligned-Inception-ResNet) | 37.5 | No | Deformable Convolutional Networks | 2017-03-17 | Code |
| 219 | Faster R-CNN (ImageNet+300M) | 37.4 | No | Revisiting Unreasonable Effectiveness of Data in... | 2017-07-10 | Code |
| 220 | Mask R-CNN (Bottleneck-injected ResNet-50, FPN) | 36.9 | No | torchdistill: A Modular, Configuration-Driven Fr... | 2020-11-25 | Code |
| 221 | Faster R-CNN + TDM | 36.8 | No | Beyond Skip Connections: Top-Down Modulation for... | 2016-12-20 | Code |
| 222 | Cascade R-CNN (ResNet-50-FPN+) | 36.5 | No | Cascade R-CNN: Delving into High Quality Object ... | 2017-12-03 | Code |
| 223 | RefineDet512 (ResNet-101) | 36.4 | No | Single-Shot Refinement Neural Network for Object... | 2017-11-18 | Code |
| 224 | Faster R-CNN + FPN | 36.2 | Yes | Feature Pyramid Networks for Object Detection | 2016-12-09 | Code |
| 225 | Faster R-CNN (Bottleneck-injected ResNet-50 and FPN) | 35.9 | No | torchdistill: A Modular, Configuration-Driven Fr... | 2020-11-25 | Code |