Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Chien-Yao Wang, I-Hau Yeh, Hong-Yuan Mark Liao

2024-02-21 | Real-Time Object Detection, Object Detection

Abstract

Today's deep learning methods focus on how to design the most appropriate objective functions so that the model's predictions are as close as possible to the ground truth. Meanwhile, an appropriate architecture that facilitates the acquisition of enough information for prediction has to be designed. Existing methods ignore the fact that when input data undergoes layer-by-layer feature extraction and spatial transformation, a large amount of information is lost. This paper delves into the important issues of data loss when data is transmitted through deep networks, namely the information bottleneck and reversible functions. We propose the concept of programmable gradient information (PGI) to cope with the various changes required by deep networks to achieve multiple objectives. PGI can provide complete input information for the target task to calculate the objective function, so that reliable gradient information can be obtained to update network weights. In addition, a new lightweight network architecture, the Generalized Efficient Layer Aggregation Network (GELAN), is designed based on gradient path planning. GELAN's architecture confirms that PGI achieves superior results on lightweight models. We verified the proposed GELAN and PGI on object detection with the MS COCO dataset. The results show that GELAN uses only conventional convolution operators to achieve better parameter utilization than the state-of-the-art methods developed based on depth-wise convolution. PGI can be used for a variety of models, from lightweight to large. It can be used to obtain complete information, so that train-from-scratch models can achieve better results than state-of-the-art models pre-trained using large datasets. The comparison results are shown in Figure 1. The source code is available at: https://github.com/WongKinYiu/yolov9.
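
The information-loss issue the abstract describes can be illustrated with the standard data processing inequality. This is a sketch, not necessarily the paper's exact formulation. The symbols are assumptions introduced here for illustration: X is the input, f_θ and g_φ are two consecutive network transformations, and r_ψ is a transformation with inverse v_ζ. The first line says that each additional layer can only keep or lose mutual information with the input. The second says that a reversible transformation loses none.

```latex
% Assumed notation: X input; f_\theta, g_\phi consecutive layers;
% r_\psi a reversible transformation with inverse v_\zeta.
\begin{align}
  I(X; X) \;&\ge\; I\bigl(X;\, f_\theta(X)\bigr) \;\ge\; I\bigl(X;\, g_\phi(f_\theta(X))\bigr) \\
  X = v_\zeta\bigl(r_\psi(X)\bigr) \;&\Longrightarrow\; I(X; X) = I\bigl(X;\, r_\psi(X)\bigr)
\end{align}
```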

Results

Task              Dataset                            Metric   Value   Model
Object Detection  COCO (Common Objects in Context)   box AP   55.6    YOLOv9-E
Object Detection  COCO (Common Objects in Context)   box AP   55      GELAN-E
Object Detection  COCO (Common Objects in Context)   box AP   53      YOLOv9-C
Object Detection  COCO (Common Objects in Context)   box AP   52.5    GELAN-C
Object Detection  COCO (Common Objects in Context)   box AP   51.4    YOLOv9-M
Object Detection  COCO (Common Objects in Context)   box AP   51.1    GELAN-M
Object Detection  COCO (Common Objects in Context)   box AP   46.8    YOLOv9-S
Object Detection  COCO (Common Objects in Context)   box AP   46.7    GELAN-S

The same eight results are also listed under the task tags 3D, 2D Classification, 2D Object Detection, and 16k.
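
As a small worked example, the box AP values in the results table above can be compared pairwise. The sketch below assumes each YOLOv9 variant should be paired with the GELAN variant carrying the same size suffix (E, C, M, S). Reading the difference as the contribution of PGI is an interpretation suggested by the abstract, not something this page states. `BOX_AP` and `ap_gain` are hypothetical names introduced for illustration.

```python
# box AP values copied from the results table above (COCO).
BOX_AP = {
    "YOLOv9-E": 55.6, "GELAN-E": 55.0,
    "YOLOv9-C": 53.0, "GELAN-C": 52.5,
    "YOLOv9-M": 51.4, "GELAN-M": 51.1,
    "YOLOv9-S": 46.8, "GELAN-S": 46.7,
}


def ap_gain(size: str) -> float:
    """Box AP of YOLOv9-<size> minus box AP of GELAN-<size>.

    Hypothetical helper: pairing models by size suffix is an assumption.
    The result is rounded to one decimal to match the table's precision.
    """
    return round(BOX_AP[f"YOLOv9-{size}"] - BOX_AP[f"GELAN-{size}"], 1)


if __name__ == "__main__":
    for size in ("E", "C", "M", "S"):
        print(f"{size}: +{ap_gain(size):.1f} box AP")
    # Prints:
    # E: +0.6 box AP
    # C: +0.5 box AP
    # M: +0.3 box AP
    # S: +0.1 box AP
```

Under this pairing, the gap narrows as the models get smaller, from 0.6 box AP at size E to 0.1 at size S.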

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge (2025-07-08)
Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations (2025-07-07)