Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Point-Voxel CNN for Efficient 3D Deep Learning

Zhijian Liu, Haotian Tang, Yujun Lin, Song Han

2019-07-08 | NeurIPS 2019 12
Tasks: Scene Segmentation, Deep Learning, 3D Semantic Segmentation, object-detection, 3D Object Detection, Object Detection

Abstract

We present Point-Voxel CNN (PVCNN) for efficient, fast 3D deep learning. Previous work processes 3D data using either voxel-based or point-based neural network models. However, both approaches are computationally inefficient. The computation cost and memory footprint of voxel-based models grow cubically with the input resolution, making it memory-prohibitive to scale up the resolution. In point-based networks, up to 80% of the time is spent structuring the sparse data, which has poor memory locality, rather than on the actual feature extraction. In this paper, we propose PVCNN, which represents the 3D input data as points to reduce memory consumption, while performing the convolutions in voxels to reduce irregular, sparse data access and improve locality. Our PVCNN model is both memory and computation efficient. Evaluated on semantic and part segmentation datasets, it achieves much higher accuracy than the voxel-based baseline with a 10x GPU memory reduction; it also outperforms the state-of-the-art point-based models with a 7x measured speedup on average. Remarkably, the narrower version of PVCNN achieves a 2x speedup over PointNet (an extremely efficient model) on part and scene segmentation benchmarks with much higher accuracy. We validate the general effectiveness of PVCNN on 3D object detection: by replacing the primitives in Frustum PointNet with PVConv, it outperforms Frustum PointNet++ by 2.4% mAP on average with a 1.5x measured speedup and GPU memory reduction.
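The point-voxel idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' PVConv implementation: the function names (`voxelize`, `devoxelize`, `point_voxel_block`), the averaging of features per voxel, the nearest-voxel lookup, the additive fusion, and the omission of the 3D convolution itself are all simplifying assumptions made here for illustration. It only shows how features can stay stored per point while a coarse, regular grid is used for the neighborhood computation.

```python
import numpy as np

def voxelize(coords, feats, r):
    """Average point features into an r x r x r grid (illustrative only).

    coords: (N, 3) float array, feats: (N, C) float array.
    Returns the (r, r, r, C) grid and each point's (N, 3) integer voxel index.
    """
    # Normalize coordinates into [0, 1] so they fit the grid.
    lo = coords.min(axis=0)
    span = max(float((coords.max(axis=0) - lo).max()), 1e-9)
    norm = (coords - lo) / span
    idx = np.clip((norm * r).astype(np.int64), 0, r - 1)

    # Scatter-add features into flattened voxel bins, then divide by counts.
    flat = idx[:, 0] * r * r + idx[:, 1] * r + idx[:, 2]
    sums = np.zeros((r ** 3, feats.shape[1]))
    counts = np.zeros(r ** 3)
    np.add.at(sums, flat, feats)
    np.add.at(counts, flat, 1.0)
    grid = sums / np.maximum(counts, 1.0)[:, None]
    # The grid has r**3 cells, so doubling r multiplies the cell count by 8.
    return grid.reshape(r, r, r, -1), idx

def devoxelize(grid, idx):
    """Map voxel features back to points by nearest-voxel lookup (an assumption)."""
    return grid[idx[:, 0], idx[:, 1], idx[:, 2]]

def point_voxel_block(coords, feats, r, w_point):
    """Fuse a coarse voxel branch with a per-point branch (hypothetical block)."""
    grid, idx = voxelize(coords, feats, r)
    # A real model would apply 3D convolutions to `grid` here; omitted in this sketch.
    voxel_branch = devoxelize(grid, idx)             # (N, C)
    point_branch = np.maximum(feats @ w_point, 0.0)  # per-point linear layer + ReLU
    return voxel_branch + point_branch

rng = np.random.default_rng(0)
coords = rng.random((1024, 3))
feats = rng.random((1024, 8))
out = point_voxel_block(coords, feats, r=16, w_point=rng.standard_normal((8, 8)))
print(out.shape)  # (1024, 8)
```

In this sketch the grid resolution `r` can stay low because fine detail is carried by the per-point branch, which is the trade-off the abstract motivates.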

Results

Task                   Dataset                        Metric                Value  Model
Semantic Segmentation  S3DIS                          mIoU (6-Fold)         58.98  PVCNN++
Semantic Segmentation  ShapeNet-Part                  Instance Average IoU  86.2   PVCNN volumetric
3D Object Detection    KITTI Cars Easy val            AP                    84.02  PVCNN
3D Object Detection    KITTI Cars Moderate val        AP                    71.54  PVCNN
3D Object Detection    KITTI Cars Hard val            AP                    63.81  PVCNN
3D Object Detection    KITTI Pedestrian Easy val      AP                    73.2   PVCNN
3D Object Detection    KITTI Pedestrian Moderate val  AP                    64.71  PVCNN
3D Object Detection    KITTI Pedestrian Hard val      AP                    56.78  PVCNN
3D Object Detection    KITTI Cyclist Easy val         AP                    81.4   PVCNN
3D Object Detection    KITTI Cyclist Moderate val     AP                    59.97  PVCNN
3D Object Detection    KITTI Cyclist Hard val         AP                    56.24  PVCNN

The same nine KITTI results are also listed under the task tags Object Detection, 3D, 2D Classification, 2D Object Detection, and 16k. The S3DIS result is also listed under 3D Semantic Segmentation and 10-shot image generation, and the ShapeNet-Part result under 10-shot image generation.
