Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement

Yansong Peng, Hebei Li, Peixi Wu, Yueyi Zhang, Xiaoyan Sun, Feng Wu

2024-10-17 | regression | Real-Time Object Detection

Abstract

We introduce D-FINE, a powerful real-time object detector that achieves outstanding localization precision by redefining the bounding box regression task in DETR models. D-FINE comprises two key components: Fine-grained Distribution Refinement (FDR) and Global Optimal Localization Self-Distillation (GO-LSD). FDR transforms the regression process from predicting fixed coordinates to iteratively refining probability distributions, providing a fine-grained intermediate representation that significantly enhances localization accuracy. GO-LSD is a bidirectional optimization strategy that transfers localization knowledge from refined distributions to shallower layers through self-distillation, while also simplifying the residual prediction tasks for deeper layers. Additionally, D-FINE incorporates lightweight optimizations in computationally intensive modules and operations, achieving a better balance between speed and accuracy. Specifically, D-FINE-L / X achieves 54.0% / 55.8% AP on the COCO dataset at 124 / 78 FPS on an NVIDIA T4 GPU. When pretrained on Objects365, D-FINE-L / X attains 57.1% / 59.3% AP, surpassing all existing real-time detectors. Furthermore, our method significantly enhances the performance of a wide range of DETR models by up to 5.3% AP with negligible extra parameters and training costs. Our code and pretrained models: https://github.com/Peterande/D-FINE.
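The idea behind FDR, as the abstract describes it, is that each decoder layer refines a probability distribution over candidate offsets for a box edge instead of predicting a fixed coordinate. The toy sketch below illustrates that idea in plain Python for a single edge. It is not the authors' implementation: the helper names, the bin values, and the `scale` parameter are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_offset(logits, bin_values):
    """Expected edge offset under the distribution defined by the logits."""
    probs = softmax(logits)
    return sum(p * v for p, v in zip(probs, bin_values))


def refine_edge(initial_edge, layer_residual_logits, bin_values, scale=1.0):
    """Refine one box edge layer by layer (hypothetical helper).

    Each layer contributes residual logits that are added to the running
    logits, so deeper layers only adjust the distribution left by the
    shallower ones. The edge estimate after each layer is the initial
    edge plus the scaled expected offset.
    """
    logits = [0.0] * len(bin_values)  # start from a uniform distribution
    edges = []
    for residual in layer_residual_logits:
        logits = [l + r for l, r in zip(logits, residual)]
        edges.append(initial_edge + scale * expected_offset(logits, bin_values))
    return edges


# Assumed symmetric offset bins (in pixels) and two decoder layers.
bins = [-2.0, -1.0, 0.0, 1.0, 2.0]
residuals = [
    [0.0, 0.0, 0.0, 0.0, 0.0],  # layer 1: no change, offset stays at zero
    [0.0, 0.0, 0.0, 0.0, 5.0],  # layer 2: shifts mass toward the +2 bin
]
edges = refine_edge(10.0, residuals, bins)
print(edges)
```

With symmetric bins and uniform logits the expected offset is zero, so the first layer leaves the edge where it started; the second layer's residual logits then move the estimate toward the largest bin without ever leaving the range the bins cover.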

Results

Dataset: COCO (Common Objects in Context)
Metric: box AP
Listed under the tasks: Object Detection, 2D Object Detection, 2D Classification, 3D, 16k (identical values under each task)

Model        box AP
D-FINE-X+    59.3
D-FINE-X     55.8
D-FINE-M+    55.1
D-FINE-L     54.0
D-FINE-M     52.3
D-FINE-S+    50.7
D-FINE-S     48.5
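As a quick arithmetic check on the results, the snippet below computes the AP difference between each "+" model and its base model. It assumes that the "+" suffix marks the Objects365-pretrained variant, which the abstract supports for D-FINE-X (59.3% AP) but which the page does not state for the other sizes. The function name is hypothetical.

```python
# box AP values on COCO, copied from the Results listing.
results = {
    "D-FINE-X+": 59.3,
    "D-FINE-X": 55.8,
    "D-FINE-M+": 55.1,
    "D-FINE-L": 54.0,
    "D-FINE-M": 52.3,
    "D-FINE-S+": 50.7,
    "D-FINE-S": 48.5,
}


def plus_variant_gains(ap_by_model):
    """Return the AP gain of each '+' model over its base model.

    Assumption: 'NAME+' and 'NAME' are the same architecture, with '+'
    denoting extra pretraining. Models without both entries are skipped.
    """
    gains = {}
    for name, ap in ap_by_model.items():
        if name.endswith("+"):
            base = name[:-1]
            if base in ap_by_model:
                gains[base] = round(ap - ap_by_model[base], 1)
    return gains


gains = plus_variant_gains(results)
for model, gain in gains.items():
    print(f"{model}: +{gain} AP")
```

D-FINE-L has no "+" entry in the listing, so it is skipped here; the abstract reports 57.1% AP for the Objects365-pretrained D-FINE-L.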

Related Papers

Language Integration in Fine-Tuning Multimodal Large Language Models for Image-Based Regression (2025-07-20)
Neural Network-Guided Symbolic Regression for Interpretable Descriptor Discovery in Perovskite Catalysts (2025-07-16)
Imbalanced Regression Pipeline Recommendation (2025-07-16)
Second-Order Bounds for [0,1]-Valued Regression via Betting Loss (2025-07-16)
Sparse Regression Codes exploit Multi-User Diversity without CSI (2025-07-15)
Bradley-Terry and Multi-Objective Reward Modeling Are Complementary (2025-07-10)
Detection of Rail Line Track and Human Beings Near the Track to Avoid Accidents (2025-07-03)
Active Learning for Manifold Gaussian Process Regression (2025-06-26)