
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


CubifAE-3D: Monocular Camera Space Cubification for Auto-Encoder based 3D Object Detection

Shubham Shrivastava, Punarjay Chakravarty

2020-06-07
Autonomous Vehicles, 3D Object Detection From Monocular Images, Monocular 3D Object Detection, object-detection, 3D Object Detection, Object Detection

Abstract

We introduce a method for 3D object detection using a single monocular image. Starting from a synthetic dataset, we pre-train an RGB-to-Depth Auto-Encoder (AE). The encoder of this AE generates a latent embedding from the RGB image, and this embedding is then used to train a 3D Object Detector (3DOD) CNN that regresses the parameters of 3D object poses. We show that we can pre-train the AE once using paired RGB and depth images from simulation data, and subsequently train only the 3DOD network using real data comprising RGB images and 3D object pose labels (without the requirement of dense depth). Our 3DOD network utilizes a particular 'cubification' of 3D space around the camera, where each cuboid is tasked with predicting N object poses, along with their class and confidence values. The AE pre-training and this method of dividing the 3D space around the camera into cuboids give our method its name: CubifAE-3D. We demonstrate results for monocular 3D object detection in the Autonomous Vehicle (AV) use-case with the Virtual KITTI 2 and the KITTI datasets.
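The 'cubification' idea can be illustrated with a minimal sketch that maps an object centre in camera coordinates to the cuboid responsible for it. The grid extents, the grid resolution, the axis convention, and the names `CuboidGrid`, `cell_of`, and `encode` are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class CuboidGrid:
    # Hypothetical extents in metres (assumed camera frame: x right, y down, z forward).
    x_range: Tuple[float, float] = (-40.0, 40.0)
    y_range: Tuple[float, float] = (-5.0, 5.0)
    z_range: Tuple[float, float] = (0.0, 100.0)
    # Hypothetical number of cuboids along x, y, z.
    shape: Tuple[int, int, int] = (4, 1, 5)

    def _axes(self):
        return zip((self.x_range, self.y_range, self.z_range), self.shape)

    def cell_of(self, center: Vec3) -> Optional[Tuple[int, int, int]]:
        """Return the (ix, iy, iz) index of the cuboid containing `center`,
        or None if the point lies outside the grid."""
        index = []
        for value, ((lo, hi), n) in zip(center, self._axes()):
            if not lo <= value < hi:
                return None
            # Clamp to n - 1 to guard against floating-point round-up at the edge.
            index.append(min(n - 1, int((value - lo) / (hi - lo) * n)))
        return tuple(index)

    def encode(self, center: Vec3):
        """Return (cell index, offset of the centre within that cell in [0, 1)),
        one plausible way to express a per-cuboid regression target."""
        cell = self.cell_of(center)
        if cell is None:
            return None
        offsets = []
        for value, ((lo, hi), n), i in zip(center, self._axes(), cell):
            size = (hi - lo) / n
            offsets.append((value - (lo + i * size)) / size)
        return cell, tuple(offsets)


grid = CuboidGrid()
cell, offsets = grid.encode((10.0, 1.0, 30.0))
print(cell, offsets)
```

In a detector of the kind the abstract describes, each cuboid would hold up to N such targets together with class and confidence values; how those slots are filled is not stated in the abstract and is left out of this sketch.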

Results

| Dataset | Metric | Value | Model |
| --- | --- | --- | --- |
| KITTI Cars Moderate | AP Medium | 7.94 | CubifAE-3D |
| KITTI Pedestrian Hard | AP Hard | 4.82 | CubifAE-3D |
| KITTI Pedestrians Moderate val | AP Medium | 5.43 | CubifAE-3D |
| KITTI Cars Hard | AP Hard | 6.42 | CubifAE-3D |
| Virtual KITTI 2 | mAP@0.3 | 86.6 | CubifAE-3D |
| Virtual KITTI 2 | mAP@0.5 | 66.7 | CubifAE-3D |

The same six results are listed under each of the following tasks: Object Detection, 3D, 3D Object Detection, 2D Classification, 2D Object Detection, 16k.
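The two Virtual KITTI 2 rows report mAP at thresholds that are presumably intersection-over-union (IoU) cut-offs of 0.3 and 0.5; the page does not define the metric, so this reading is an assumption. The sketch below shows how one predicted box can count as a match at the looser threshold but not at the stricter one. It uses axis-aligned boxes for simplicity, whereas KITTI-style evaluation uses boxes with a yaw angle, and the function name `iou_3d_axis_aligned` is hypothetical.

```python
from typing import Sequence


def iou_3d_axis_aligned(box_a: Sequence[float], box_b: Sequence[float]) -> float:
    """IoU of two axis-aligned 3D boxes given as (xmin, ymin, zmin, xmax, ymax, zmax).

    Simplification: real 3D detection benchmarks use rotated boxes.
    """
    intersection = 1.0
    for axis in range(3):
        lo = max(box_a[axis], box_b[axis])
        hi = min(box_a[axis + 3], box_b[axis + 3])
        if hi <= lo:
            return 0.0  # no overlap along this axis means no overlap at all
        intersection *= hi - lo

    def volume(box: Sequence[float]) -> float:
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    union = volume(box_a) + volume(box_b) - intersection
    return intersection / union


ground_truth = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
prediction = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)  # shifted by half a box along x

iou = iou_3d_axis_aligned(ground_truth, prediction)
print(f"IoU = {iou:.3f}")
print("match at 0.3:", iou >= 0.3)
print("match at 0.5:", iou >= 0.5)
```

Under this reading, a stricter threshold can only turn matches into misses, which is consistent with the mAP@0.5 figure being lower than the mAP@0.3 figure.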

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
Fast and Accurate Collision Probability Estimation for Autonomous Vehicles using Adaptive Sigma-Point Sampling (2025-07-08)
ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge (2025-07-08)