Point-Voxel CNN for Efficient 3D Deep Learning

Zhijian Liu, Haotian Tang, Yujun Lin, Song Han

2019-07-08 | NeurIPS 2019 12 | Scene Segmentation, Deep Learning, 3D Semantic Segmentation, Object Detection, 3D Object Detection


Abstract

We present Point-Voxel CNN (PVCNN) for efficient, fast 3D deep learning. Previous work processes 3D data using either voxel-based or point-based NN models. However, both approaches are computationally inefficient. The computation cost and memory footprints of the voxel-based models grow cubically with the input resolution, making it memory-prohibitive to scale up the resolution. As for point-based networks, up to 80% of the time is wasted on structuring the sparse data, which have rather poor memory locality, not on the actual feature extraction. In this paper, we propose PVCNN that represents the 3D input data in points to reduce the memory consumption, while performing the convolutions in voxels to reduce the irregular, sparse data access and improve the locality. Our PVCNN model is both memory and computation efficient. Evaluated on semantic and part segmentation datasets, it achieves much higher accuracy than the voxel-based baseline with 10x GPU memory reduction; it also outperforms the state-of-the-art point-based models with 7x measured speedup on average. Remarkably, the narrower version of PVCNN achieves 2x speedup over PointNet (an extremely efficient model) on part and scene segmentation benchmarks with much higher accuracy. We validate the general effectiveness of PVCNN on 3D object detection: by replacing the primitives in Frustum PointNet with PVConv, it outperforms Frustum PointNet++ by 2.4% mAP on average with 1.5x measured speedup and GPU memory reduction.
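The point-voxel idea in the abstract (keep the data as points, run the convolution on a voxel grid) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (normalize, voxelize, box_filter3d, devoxelize, pvconv_sketch), the min-max normalization, the 3x3x3 box filter standing in for a learned 3D convolution, the nearest-voxel lookup when mapping back to points, and the additive fusion with a shared linear point branch are all assumptions chosen to keep the example short. Note that the grid holds r**3 cells, which is the cubic growth in memory the abstract refers to.

```python
import numpy as np

def normalize(coords):
    # Assumption: simple min-max scaling into [0, 1] with one shared scale.
    lo = coords.min(axis=0)
    span = (coords.max(axis=0) - lo).max()
    return (coords - lo) / max(span, 1e-9)

def voxelize(coords, feats, r):
    # Average the features of all points that fall into the same voxel.
    idx = np.clip((normalize(coords) * r).astype(np.int64), 0, r - 1)
    flat = (idx[:, 0] * r + idx[:, 1]) * r + idx[:, 2]  # voxel id per point
    c = feats.shape[1]
    grid = np.zeros((r ** 3, c))   # r**3 cells: memory grows cubically in r
    counts = np.zeros(r ** 3)
    np.add.at(grid, flat, feats)   # unbuffered scatter-add of point features
    np.add.at(counts, flat, 1.0)
    grid /= np.maximum(counts, 1.0)[:, None]
    return grid.reshape(r, r, r, c), flat

def box_filter3d(grid):
    # Stand-in for a learned 3D convolution: 3x3x3 mean filter, zero padding.
    r = grid.shape[0]
    padded = np.pad(grid, ((1, 1), (1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(grid)
    for dx in range(3):
        for dy in range(3):
            for dz in range(3):
                out += padded[dx:dx + r, dy:dy + r, dz:dz + r]
    return out / 27.0

def devoxelize(grid, flat):
    # Assumption: nearest-voxel lookup (each point reads its own voxel).
    return grid.reshape(-1, grid.shape[-1])[flat]

def pvconv_sketch(coords, feats, r, w_point):
    # Voxel branch: regular, contiguous memory access on a dense grid.
    grid, flat = voxelize(coords, feats, r)
    voxel_out = devoxelize(box_filter3d(grid), flat)
    # Point branch: a shared linear map applied to every point independently.
    point_out = feats @ w_point
    return voxel_out + point_out   # assumption: fuse the branches by addition

rng = np.random.default_rng(0)
coords = rng.random((100, 3))
feats = rng.random((100, 4))
fused = pvconv_sketch(coords, feats, r=8, w_point=np.eye(4))
print(fused.shape)
```

The output keeps one feature vector per input point, so the representation that flows between layers stays point-based and only the intermediate grid is dense.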

Results

Task	Dataset	Metric	Value	Model
Semantic Segmentation	S3DIS	mIoU (6-Fold)	58.98	PVCNN++
Semantic Segmentation	ShapeNet-Part	Instance Average IoU	86.2	PVCNN volumetric
Object Detection	KITTI Cars Hard val	AP	63.81	PVCNN
Object Detection	KITTI Cyclist Moderate val	AP	59.97	PVCNN
Object Detection	KITTI Cyclist Easy val	AP	81.4	PVCNN
Object Detection	KITTI Cyclist Hard val	AP	56.24	PVCNN
Object Detection	KITTI Pedestrian Moderate val	AP	64.71	PVCNN
Object Detection	KITTI Pedestrian Hard val	AP	56.78	PVCNN
Object Detection	KITTI Cars Moderate val	AP	71.54	PVCNN
Object Detection	KITTI Pedestrian Easy val	AP	73.2	PVCNN
Object Detection	KITTI Cars Easy val	AP	84.02	PVCNN
