Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

Danila Rukhovich, Anna Vorontsova, Anton Konushin

Published: 2021-06-02
Tasks: Monocular 3D Object Detection, Room Layout Estimation, 3D Object Detection, Object Detection

Abstract

In this paper, we introduce the task of multi-view RGB-based 3D object detection as an end-to-end optimization problem. To address this problem, we propose ImVoxelNet, a novel fully convolutional method of 3D object detection based on monocular or multi-view RGB images. The number of monocular images in each multi-view input can vary during training and inference; in fact, this number may be different for each multi-view input. ImVoxelNet successfully handles both indoor and outdoor scenes, which makes it general-purpose. Specifically, it achieves state-of-the-art results in car detection on the KITTI (monocular) and nuScenes (multi-view) benchmarks among all methods that accept RGB images. Moreover, it surpasses existing RGB-based 3D object detection methods on the SUN RGB-D dataset. On ScanNet, ImVoxelNet sets a new benchmark for multi-view 3D object detection. The source code and the trained models are available at https://github.com/saic-vul/imvoxelnet.

Results

The detection results below are listed identically under each of these task labels: Object Detection, 3D, 3D Object Detection, 2D Classification, 2D Object Detection, 16k.

Dataset      Metric                      Value   Model
DAIR-V2X-I   AP|R40(easy)                44.8    ImVoxelNet
DAIR-V2X-I   AP|R40(moderate)            37.6    ImVoxelNet
DAIR-V2X-I   AP|R40(hard)                37.6    ImVoxelNet
ScanNetV2    mAP@0.25                    48.1    ImVoxelNet (RGB only)
ScanNetV2    mAP@0.5                     22.7    ImVoxelNet (RGB only)
SUN RGB-D    AP@0.15 (10 / NYU-37)       42.69   ImVoxelNet
SUN RGB-D    AP@0.15 (10 / PNet-30)      48.74   ImVoxelNet
SUN RGB-D    AP@0.15 (NYU-37)            21.08   ImVoxelNet

Task: Room Layout Estimation

Dataset      Metric                      Value   Model
SUN RGB-D    Camera Pitch                2.63    ImVoxelNet
SUN RGB-D    Camera Roll                 1.96    ImVoxelNet
SUN RGB-D    IoU                         59.3    ImVoxelNet

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge (2025-07-08)
Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations (2025-07-07)