Delving into Localization Errors for Monocular 3D Object Detection

Xinzhu Ma, Yinmin Zhang, Dan Xu, Dongzhan Zhou, Shuai Yi, Haojie Li, Wanli Ouyang

2021-03-30 | CVPR 2021
Tasks: 3D Object Detection From Monocular Images, Monocular 3D Object Detection, Autonomous Driving, object-detection, 3D Object Detection, Object Detection

Abstract

Estimating 3D bounding boxes from monocular images is an essential component of autonomous driving, but accurate 3D object detection from this kind of data is very challenging. In this work, through intensive diagnostic experiments, we quantify the impact of each sub-task and find that 'localization error' is the vital factor restricting monocular 3D detection. We also investigate the underlying reasons behind localization errors, analyze the issues they might bring, and propose three strategies. First, we revisit the misalignment between the center of the 2D bounding box and the projected center of the 3D object, which is a vital factor leading to low localization accuracy. Second, we observe that accurately localizing distant objects with existing technologies is almost impossible, and that those samples mislead the learned network. We therefore propose removing such samples from the training set to improve the overall performance of the detector. Lastly, we propose a novel 3D IoU oriented loss for estimating object size, which is not affected by 'localization error'. We conduct extensive experiments on the KITTI dataset, where the proposed method achieves real-time detection and outperforms previous methods by a large margin. The code will be made available at: https://github.com/xinzhuma/monodle.
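
The first two strategies in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that); it only assumes a standard pinhole camera model, and the names Sample, project_point, center_offset and drop_distant, as well as the max_depth threshold, are hypothetical and chosen for illustration. The 3D IoU oriented loss is not sketched here because the abstract does not specify its form.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    """One annotated object (hypothetical structure, for illustration only)."""
    center_3d: Tuple[float, float, float]    # (x, y, z) in camera coordinates, metres
    box_2d: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def project_point(point, fx, fy, cx, cy):
    """Project a 3D camera-frame point to pixel coordinates (pinhole model)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (fx * x / z + cx, fy * y / z + cy)


def box_center(box):
    """Center of an axis-aligned 2D box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def center_offset(sample, fx, fy, cx, cy):
    """Pixel offset between the projected 3D center and the 2D box center.

    A non-zero offset is the misalignment the abstract refers to.
    """
    u, v = project_point(sample.center_3d, fx, fy, cx, cy)
    bu, bv = box_center(sample.box_2d)
    return (u - bu, v - bv)


def drop_distant(samples: List[Sample], max_depth: float) -> List[Sample]:
    """Keep only samples whose depth (z) is at most max_depth.

    max_depth is an assumed, user-chosen threshold, not a value from the paper.
    """
    return [s for s in samples if s.center_3d[2] <= max_depth]
```

With hypothetical intrinsics such as fx = fy = 700, cx = 600, cy = 180, center_offset reports how far the 2D box center lies from the projected 3D center for each sample, and drop_distant can be applied to a training list before the detector sees it.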

Results

Task                 Dataset              Metric     Value  Model
Object Detection     Rope3D               AP@0.7     13.58  MonoDLE+(G)
Object Detection     KITTI Cars Moderate  AP Medium  12.26  MonoDLE
Object Detection     KITTI-360            AP25       28.99  MonoDLE
Object Detection     KITTI-360            AP50       0.85   MonoDLE
3D                   Rope3D               AP@0.7     13.58  MonoDLE+(G)
3D                   KITTI Cars Moderate  AP Medium  12.26  MonoDLE
3D                   KITTI-360            AP25       28.99  MonoDLE
3D                   KITTI-360            AP50       0.85   MonoDLE
3D Object Detection  Rope3D               AP@0.7     13.58  MonoDLE+(G)
3D Object Detection  KITTI Cars Moderate  AP Medium  12.26  MonoDLE
2D Classification    Rope3D               AP@0.7     13.58  MonoDLE+(G)
2D Classification    KITTI Cars Moderate  AP Medium  12.26  MonoDLE
2D Classification    KITTI-360            AP25       28.99  MonoDLE
2D Classification    KITTI-360            AP50       0.85   MonoDLE
2D Object Detection  Rope3D               AP@0.7     13.58  MonoDLE+(G)
2D Object Detection  KITTI Cars Moderate  AP Medium  12.26  MonoDLE
2D Object Detection  KITTI-360            AP25       28.99  MonoDLE
2D Object Detection  KITTI-360            AP50       0.85   MonoDLE
16k                  Rope3D               AP@0.7     13.58  MonoDLE+(G)
16k                  KITTI Cars Moderate  AP Medium  12.26  MonoDLE
16k                  KITTI-360            AP25       28.99  MonoDLE
16k                  KITTI-360            AP50       0.85   MonoDLE

Related Papers

GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving (2025-07-19)
AGENTS-LLM: Augmentative GENeration of Challenging Traffic Scenarios with an Agentic LLM Framework (2025-07-18)
World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving (2025-07-17)
Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models (2025-07-17)
Channel-wise Motion Features for Efficient Motion Segmentation (2025-07-17)
LaViPlan : Language-Guided Visual Path Planning with RLVR (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)