Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

DAMO-YOLO : A Report on Real-Time Object Detection Design

Xianzhe Xu, Yiqi Jiang, Weihua Chen, Yilun Huang, Yuan Zhang, Xiuyu Sun

2022-11-23 | Real-Time Object Detection | Neural Architecture Search | object-detection | Object Detection

Abstract

In this report, we present a fast and accurate object detection method dubbed DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO series. DAMO-YOLO is extended from YOLO with several new technologies, including Neural Architecture Search (NAS), an efficient Reparameterized Generalized-FPN (RepGFPN), a lightweight head with AlignedOTA label assignment, and distillation enhancement. In particular, we use MAE-NAS, a method guided by the principle of maximum entropy, to search our detection backbone under the constraints of low latency and high performance, producing ResNet/CSP-like structures with spatial pyramid pooling and focus modules. In the design of necks and heads, we follow the rule of "large neck, small head". We import Generalized-FPN with accelerated queen-fusion to build the detector neck and upgrade its CSPNet with efficient layer aggregation networks (ELAN) and reparameterization. We then investigate how detector head size affects detection performance and find that a heavy neck with only one task projection layer yields better results. In addition, AlignedOTA is proposed to solve the misalignment problem in label assignment, and a distillation schema is introduced to improve performance further. Based on these techniques, we build a suite of models at various scales to meet the needs of different scenarios. For general industry requirements, we propose DAMO-YOLO-T/S/M/L, which achieve 43.6/47.7/50.2/51.9 mAP on COCO with latencies of 2.78/3.83/5.62/7.95 ms on T4 GPUs, respectively. Additionally, for edge devices with limited computing power, we propose the lightweight DAMO-YOLO-Ns/Nm/Nl models, which achieve 32.3/38.2/40.5 mAP on COCO with latencies of 4.08/5.05/6.69 ms on X86-CPU, respectively. Our proposed general and lightweight models outperform other YOLO series models in their respective application scenarios.
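The abstract reports speed as per-image latency, while the Results entries on this page report frames per second. As a rough aid for reading the two side by side, the sketch below converts the abstract's T4 latencies into approximate throughput. It assumes batch size 1 with no pipelining (the abstract does not state this), and the names `GENERAL_MODELS`, `latency_ms_to_fps`, and `summarize` are illustrative, not part of any DAMO-YOLO code. The T4 figures are not directly comparable with the V100 figures under Results, since the hardware differs.

```python
# Figures copied from the abstract: (COCO mAP, T4 latency in milliseconds).
GENERAL_MODELS = {
    "DAMO-YOLO-T": (43.6, 2.78),
    "DAMO-YOLO-S": (47.7, 3.83),
    "DAMO-YOLO-M": (50.2, 5.62),
    "DAMO-YOLO-L": (51.9, 7.95),
}


def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert per-image latency in ms to images per second.

    Assumption (not stated in the abstract): batch size 1, no pipelining,
    so throughput is simply the reciprocal of latency.
    """
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def summarize(models: dict) -> list:
    """Return (name, mAP, latency_ms, approx_fps) tuples in insertion order."""
    return [
        (name, m_ap, latency, latency_ms_to_fps(latency))
        for name, (m_ap, latency) in models.items()
    ]


if __name__ == "__main__":
    for name, m_ap, latency, fps in summarize(GENERAL_MODELS):
        print(f"{name}: {m_ap:.1f} mAP, {latency:.2f} ms, ~{fps:.0f} img/s")
```

Running the script prints one line per model, which makes the accuracy/speed trade-off across the T/S/M/L scales easier to see at a glance.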

Results

Task: Object Detection
Dataset: COCO (Common Objects in Context)

Model          box AP    FPS (V100, b=1)
DAMO-YOLO-T    42        397
DAMO-YOLO-S    46        325
DAMO-YOLO-M    49.2      233
DAMO-YOLO-L    50.8      126

The same values are also listed under the task labels 3D, 2D Classification, 2D Object Detection, and 16k.

Related Papers

DASViT: Differentiable Architecture Search for Vision Transformer (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge (2025-07-08)