Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications

Chuyi Li, Lulu Li, Hongliang Jiang, Kaiheng Weng, Yifei Geng, Liang Li, Zaidan Ke, Qingyuan Li, Meng Cheng, Weiqiang Nie, Yiduo Li, Bo Zhang, Yufei Liang, Linyuan Zhou, Xiaoming Xu, Xiangxiang Chu, Xiaoming Wei, Xiaolin Wei

2022-09-07 | Quantization, Real-Time Object Detection, Pedestrian Detection, Object Detection

Abstract

For years, the YOLO series has been the de facto industry-level standard for efficient object detection. The YOLO community has prospered overwhelmingly to enrich its use in a multitude of hardware platforms and abundant scenarios. In this technical report, we strive to push its limits to the next level, stepping forward with an unwavering mindset for industry application. Considering the diverse requirements for speed and accuracy in the real environment, we extensively examine the up-to-date object detection advancements from both industry and academia. Specifically, we heavily assimilate ideas from recent network design, training strategies, testing techniques, quantization, and optimization methods. On top of this, we integrate our thoughts and practice to build a suite of deployment-ready networks at various scales to accommodate diversified use cases. With the generous permission of YOLO authors, we name it YOLOv6. We also express our warm welcome to users and contributors for further enhancement. For a glimpse of performance, our YOLOv6-N hits 35.9% AP on the COCO dataset at a throughput of 1234 FPS on an NVIDIA Tesla T4 GPU. YOLOv6-S strikes 43.5% AP at 495 FPS, outperforming other mainstream detectors at the same scale (YOLOv5-S, YOLOX-S, and PPYOLOE-S). Our quantized version of YOLOv6-S even brings a new state-of-the-art 43.3% AP at 869 FPS. Furthermore, YOLOv6-M/L also achieve better accuracy (i.e., 49.5%/52.3%) than other detectors with a similar inference speed. We carefully conducted experiments to validate the effectiveness of each component. Our code is made available at https://github.com/meituan/YOLOv6.
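To put the throughput figures quoted in the abstract in perspective, the short sketch below converts FPS into an average time per image and compares the quantized YOLOv6-S against its unquantized counterpart. The dictionary layout and the helper names `ms_per_image` and `speedup` are illustrative assumptions, not part of the YOLOv6 codebase, and the per-image time is only an average implied by throughput, not a measured single-image latency.

```python
# Figures copied from the abstract (COCO AP, throughput on an NVIDIA Tesla T4).
# The structure and helper names are hypothetical, for illustration only.
REPORTED = {
    "YOLOv6-N": {"ap": 35.9, "fps": 1234},
    "YOLOv6-S": {"ap": 43.5, "fps": 495},
    "YOLOv6-S (quantized)": {"ap": 43.3, "fps": 869},
}


def ms_per_image(fps: float) -> float:
    """Average milliseconds per image implied by a throughput in FPS."""
    return 1000.0 / fps


def speedup(fps_new: float, fps_base: float) -> float:
    """Throughput ratio of one model over a baseline."""
    return fps_new / fps_base


for name, row in REPORTED.items():
    print(f"{name}: {row['ap']}% AP, {ms_per_image(row['fps']):.2f} ms/image")

base = REPORTED["YOLOv6-S"]
quant = REPORTED["YOLOv6-S (quantized)"]
ratio = speedup(quant["fps"], base["fps"])
ap_drop = round(base["ap"] - quant["ap"], 1)
print(f"Quantized YOLOv6-S: {ratio:.2f}x throughput for {ap_drop} AP lower")
```

On these numbers, quantization trades roughly 0.2 AP for about 1.76 times the throughput of the unquantized YOLOv6-S.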

Results

Task                  Dataset                           Metric                Value  Model
Autonomous Vehicles   DVTOD                             mAP                   84.4   YOLOv6 (Thermal)
Autonomous Vehicles   DVTOD                             mAP                   38.1   YOLOv6 (Visible)
Object Detection      COCO-O                            Average mAP           32.5   YOLOv6-L6
Object Detection      COCO-O                            Effective Robustness  6.73   YOLOv6-L6
Object Detection      COCO (Common Objects in Context)  FPS (V100, b=1)       26     YOLOv6-L6(1280)
Object Detection      COCO (Common Objects in Context)  box AP                57.2   YOLOv6-L6(1280)
3D                    COCO-O                            Average mAP           32.5   YOLOv6-L6
3D                    COCO-O                            Effective Robustness  6.73   YOLOv6-L6
3D                    COCO (Common Objects in Context)  FPS (V100, b=1)       26     YOLOv6-L6(1280)
3D                    COCO (Common Objects in Context)  box AP                57.2   YOLOv6-L6(1280)
2D Classification     COCO-O                            Average mAP           32.5   YOLOv6-L6
2D Classification     COCO-O                            Effective Robustness  6.73   YOLOv6-L6
2D Classification     COCO (Common Objects in Context)  FPS (V100, b=1)       26     YOLOv6-L6(1280)
2D Classification     COCO (Common Objects in Context)  box AP                57.2   YOLOv6-L6(1280)
Pedestrian Detection  DVTOD                             mAP                   84.4   YOLOv6 (Thermal)
Pedestrian Detection  DVTOD                             mAP                   38.1   YOLOv6 (Visible)
2D Object Detection   COCO-O                            Average mAP           32.5   YOLOv6-L6
2D Object Detection   COCO-O                            Effective Robustness  6.73   YOLOv6-L6
2D Object Detection   COCO (Common Objects in Context)  FPS (V100, b=1)       26     YOLOv6-L6(1280)
2D Object Detection   COCO (Common Objects in Context)  box AP                57.2   YOLOv6-L6(1280)
16k                   COCO-O                            Average mAP           32.5   YOLOv6-L6
16k                   COCO-O                            Effective Robustness  6.73   YOLOv6-L6
16k                   COCO (Common Objects in Context)  FPS (V100, b=1)       26     YOLOv6-L6(1280)
16k                   COCO (Common Objects in Context)  box AP                57.2   YOLOv6-L6(1280)

Related Papers

Efficient Deployment of Spiking Neural Networks on SpiNNaker2 for DVS Gesture Recognition Using Neuromorphic Intermediate Representation (2025-09-04)
An End-to-End DNN Inference Framework for the SpiNNaker2 Neuromorphic MPSoC (2025-07-18)
Task-Specific Audio Coding for Machines: Machine-Learned Latent Features Are Codes for That Machine (2025-07-17)
Angle Estimation of a Single Source with Massive Uniform Circular Arrays (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)