Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Is Pseudo-Lidar needed for Monocular 3D Object detection?

Dennis Park, Rares Ambrus, Vitor Guizilini, Jie Li, Adrien Gaidon

Date: 2021-08-13
Venue: ICCV 2021 10
Tasks: Monocular 3D Object Detection, Self-Supervised Learning, Depth Estimation, object-detection, 3D Object Detection, Object Detection, Monocular Depth Estimation

Abstract

Recent progress in 3D object detection from single images leverages monocular depth estimation as a way to produce 3D pointclouds, turning cameras into pseudo-lidar sensors. These two-stage detectors improve with the accuracy of the intermediate depth estimation network, which can itself be improved without manual labels via large-scale self-supervised learning. However, they tend to suffer from overfitting more than end-to-end methods, are more complex, and the gap with similar lidar-based detectors remains significant. In this work, we propose an end-to-end, single stage, monocular 3D object detector, DD3D, that can benefit from depth pre-training like pseudo-lidar methods, but without their limitations. Our architecture is designed for effective information transfer between depth estimation and 3D detection, allowing us to scale with the amount of unlabeled pre-training data. Our method achieves state-of-the-art results on two challenging benchmarks, with 16.34% and 9.28% AP for Cars and Pedestrians (respectively) on the KITTI-3D benchmark, and 41.5% mAP on NuScenes.
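
The pseudo-lidar step the abstract refers to, turning a predicted depth map into a 3D pointcloud, can be sketched as a pinhole back-projection. This is only an illustration of that general idea, not the DD3D method or any code from the paper: the function name `backproject_depth`, the toy depth map, and the intrinsics values (`fx`, `fy`, `cx`, `cy`) are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Lift a dense depth map to camera-frame 3D points (pinhole model).

    depth[v][u] is the depth in metres at pixel column u, row v.
    fx, fy are focal lengths in pixels; cx, cy is the principal point.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:
                continue  # no valid depth estimate at this pixel
            x = (u - cx) * z / fx  # horizontal offset from the optical axis
            y = (v - cy) * z / fy  # vertical offset from the optical axis
            points.append((x, y, z))
    return points


# Toy 2x2 depth map with hypothetical intrinsics (illustrative values only).
toy_depth = [[10.0, 10.0], [0.0, 20.0]]
cloud = backproject_depth(toy_depth, fx=100.0, fy=100.0, cx=0.5, cy=0.5)
```

In a two-stage pseudo-lidar pipeline, a pointcloud like `cloud` would then be passed to a lidar-style 3D detector. The abstract contrasts this with DD3D, which is described as single-stage and end-to-end.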

Results

Task: 3D Object Detection (the same six results are also listed under the task labels Object Detection, 3D, 2D Classification, 2D Object Detection, and 16k)

Dataset                      Metric       Value (%)   Model
KITTI Cars Easy              AP Easy      23.22       DD3D
KITTI Cars Moderate          AP Medium    16.34       DD3D
KITTI Pedestrian Easy        AP Easy      13.91       DD3D
KITTI Pedestrian Hard        AP Hard      8.05        DD3D
KITTI Pedestrian Moderate    AP Medium    9.3         DD3D
KITTI Cars Hard              AP Hard      14.2        DD3D

Related Papers

A Semi-Supervised Learning Method for the Identification of Bad Exposures in Large Imaging Surveys (2025-07-17)
$S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)