Virtual Sparse Convolution for Multimodal 3D Object Detection

Hai Wu, Chenglu Wen, Shaoshuai Shi, Xin Li, Cheng Wang

2023-03-04 · CVPR 2023
Tasks: Depth Completion, Multiple Object Tracking, 3D Multi-Object Tracking, 3D Object Detection, Object Detection

Abstract

Recently, virtual/pseudo-point-based 3D object detection, which fuses RGB images and LiDAR data through depth completion, has gained great attention. However, virtual points generated from an image are very dense, introducing a huge amount of redundant computation during detection. Meanwhile, noise introduced by inaccurate depth completion significantly degrades detection precision. This paper proposes a fast yet effective backbone, termed VirConvNet, based on a new operator, VirConv (Virtual Sparse Convolution), for virtual-point-based 3D object detection. VirConv consists of two key designs: (1) StVD (Stochastic Voxel Discard) and (2) NRConv (Noise-Resistant Submanifold Convolution). StVD alleviates the computation problem by discarding large amounts of nearby redundant voxels. NRConv tackles the noise problem by encoding voxel features in both 2D image and 3D LiDAR space. By integrating VirConv, we first develop an efficient pipeline, VirConv-L, based on an early fusion design. Then, we build a high-precision pipeline, VirConv-T, based on a transformed refinement scheme. Finally, we develop a semi-supervised pipeline, VirConv-S, based on a pseudo-label framework. On the KITTI car 3D detection test leaderboard, VirConv-L achieves 85% AP with a fast running speed of 56 ms. VirConv-T and VirConv-S attain high precision of 86.3% and 87.2% AP, and currently rank 2nd and 1st, respectively. The code is available at https://github.com/hailanyi/VirConv.
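
The abstract says only that StVD discards large amounts of nearby redundant voxels; it does not give the sampling rule. As a rough illustration of that idea, and not the authors' implementation, the sketch below randomly keeps a fraction of the voxels that lie close to the sensor and keeps every distant voxel. The function name, the `near_range` threshold, the `keep_ratio` fraction, and the use of horizontal distance from the origin are all assumptions made for this example.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

# Voxel centre (x, y, z) in metres, with the sensor assumed to sit at the origin.
Voxel = Tuple[float, float, float]


def stochastic_voxel_discard(
    voxels: Sequence[Voxel],
    near_range: float = 20.0,  # hypothetical: below this distance a voxel counts as "nearby"
    keep_ratio: float = 0.1,   # hypothetical: fraction of nearby voxels that survive
    seed: Optional[int] = None,
) -> List[Voxel]:
    """Randomly drop most nearby voxels and keep every distant voxel."""
    if not 0.0 <= keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in [0, 1]")
    rng = random.Random(seed)

    # Split by horizontal distance to the sensor (an assumption of this sketch).
    near = [v for v in voxels if math.hypot(v[0], v[1]) < near_range]
    far = [v for v in voxels if math.hypot(v[0], v[1]) >= near_range]

    # Sample without replacement so that no voxel is duplicated.
    n_keep = round(len(near) * keep_ratio)
    kept_near = rng.sample(near, n_keep)
    return kept_near + far


if __name__ == "__main__":
    # 100 voxels along the x-axis at distances 0..99 m; 20 of them are "nearby".
    demo = [(float(i), 0.0, 0.0) for i in range(100)]
    kept = stochastic_voxel_discard(demo, near_range=20.0, keep_ratio=0.5, seed=0)
    print(len(kept))  # 90: 10 of the 20 nearby voxels plus all 80 distant ones
```

Dense virtual points concentrate near the sensor, so thinning only that region reduces the voxel count while leaving the sparse far-field geometry untouched. That is the motivation the abstract gives for StVD.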

Results

Task                      Dataset                      Metric  Value  Model
Video                     KITTI Test (Online Methods)  HOTA    79.9   VirConvTrack
Video                     KITTI Test (Online Methods)  IDSW    201    VirConvTrack
Video                     KITTI Test (Online Methods)  MOTA    89.1   VirConvTrack
Object Tracking           KITTI Test (Online Methods)  HOTA    79.9   VirConvTrack
Object Tracking           KITTI Test (Online Methods)  IDSW    201    VirConvTrack
Object Tracking           KITTI Test (Online Methods)  MOTA    89.1   VirConvTrack
Multiple Object Tracking  KITTI Test (Online Methods)  HOTA    79.9   VirConvTrack
Multiple Object Tracking  KITTI Test (Online Methods)  IDSW    201    VirConvTrack
Multiple Object Tracking  KITTI Test (Online Methods)  MOTA    89.1   VirConvTrack

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)
Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis (2025-07-17)
Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
PacGDC: Label-Efficient Generalizable Depth Completion with Projection Ambiguity and Consistency (2025-07-10)
ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge (2025-07-08)