Disentangling Monocular 3D Object Detection

Andrea Simonelli, Samuel Rota Bulò, Lorenzo Porzi, Manuel López-Antequera, Peter Kontschieder

2019-05-29 | ICCV 2019 10 | 3D Object Detection From Monocular Images | Monocular 3D Object Detection | Disentanglement | object-detection | 3D Object Detection | Object Detection

Abstract

In this paper we propose an approach for monocular 3D object detection from a single RGB image, which leverages a novel disentangling transformation for 2D and 3D detection losses and a novel, self-supervised confidence score for 3D bounding boxes. Our proposed loss disentanglement has the twofold advantage of simplifying the training dynamics in the presence of losses with complex interactions of parameters, and of sidestepping the issue of balancing independent regression terms. Our solution overcomes these issues by isolating the contribution made by groups of parameters to a given loss, without changing its nature. We further apply loss disentanglement to another novel loss, driven by a signed Intersection-over-Union criterion, for improving 2D detection results. Besides our methodological innovations, we critically review the AP metric used in KITTI3D, which has emerged as the most important dataset for comparing 3D detection results. We identify and resolve a flaw in the 11-point interpolated AP metric, which affects all previously published detection results and particularly biases the results of monocular 3D detection. We provide extensive experimental evaluations and ablation studies on the KITTI3D and nuScenes datasets, setting new state-of-the-art results on the car object category by large margins.
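To make the AP remark in the abstract concrete, the following is a minimal sketch of interpolated AP, not the official KITTI3D evaluation code. The function name `interpolated_ap`, the toy detector, and the 40-point recall grid that skips recall 0 are assumptions made for illustration; the abstract itself does not spell out the remedy. The sketch shows how an 11-point grid that includes recall 0 can reward a detector that finds only one object.

```python
def interpolated_ap(recall, precision, recall_points):
    """Mean interpolated precision over the given recall thresholds.

    The interpolated precision at threshold r is the highest precision
    observed at any recall >= r, or 0 if no such operating point exists.
    """
    total = 0.0
    for r in recall_points:
        total += max(
            (p for rc, p in zip(recall, precision) if rc >= r),
            default=0.0,
        )
    return total / len(recall_points)


# 11-point grid: 0.0, 0.1, ..., 1.0 (includes recall 0).
R11 = [i / 10 for i in range(11)]
# Assumed alternative grid: 1/40, 2/40, ..., 1.0 (excludes recall 0).
R40 = [i / 40 for i in range(1, 41)]

# Hypothetical detector: one correct detection among 1000 ground-truth
# objects, so a single operating point with recall 0.001 and precision 1.
recall = [0.001]
precision = [1.0]

ap11 = interpolated_ap(recall, precision, R11)
ap40 = interpolated_ap(recall, precision, R40)
print(f"11-point AP: {ap11:.4f}")
print(f"40-point AP (recall 0 excluded): {ap40:.4f}")
```

With the 11-point grid, the single detection satisfies the recall-0 threshold and contributes 1/11 to the average, even though the detector is practically useless. A grid that starts above recall 0 gives this detector a score of 0. Monocular methods, whose 3D AP values are comparatively low, are the ones most distorted by such a fixed offset.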

Results

Tasks (identical values are listed under each): Object Detection, 3D, 2D Classification, 2D Object Detection, 16k

Dataset        Metric    Value  Model
nuScenes Cars  AOE       0.08   MonoDIS
nuScenes Cars  AP 0.5m   10.7   MonoDIS
nuScenes Cars  AP 1.0m   37.5   MonoDIS
nuScenes Cars  AP 2.0m   69     MonoDIS
nuScenes Cars  AP 4.0m   85.7   MonoDIS
nuScenes Cars  ASE       0.15   MonoDIS
nuScenes Cars  ATE       0.61   MonoDIS