Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation

Damien Robert, Bruno Vallet, Loic Landrieu

2022-04-15 | CVPR 2022 | Multimodal Deep Learning, Semantic Segmentation, Colorization, 3D Semantic Segmentation

Abstract

Recent works on 3D semantic segmentation propose to exploit the synergy between images and point clouds by processing each modality with a dedicated network and projecting learned 2D features onto 3D points. Merging large-scale point clouds and images raises several challenges, such as constructing a mapping between points and pixels, and aggregating features between multiple views. Current methods require mesh reconstruction or specialized sensors to recover occlusions, and use heuristics to select and aggregate available images. In contrast, we propose an end-to-end trainable multi-view aggregation model leveraging the viewing conditions of 3D points to merge features from images taken at arbitrary positions. Our method can combine standard 2D and 3D networks and outperforms both 3D models operating on colorized point clouds and hybrid 2D/3D networks without requiring colorization, meshing, or true depth maps. We set a new state-of-the-art for large-scale indoor/outdoor semantic segmentation on S3DIS (74.7 mIoU 6-Fold) and on KITTI-360 (58.3 mIoU). Our full pipeline is accessible at https://github.com/drprojects/DeepViewAgg, and only requires raw 3D scans and a set of images and poses.
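To make the aggregation idea in the abstract concrete, here is a minimal sketch of one way per-view image features could be merged for a single 3D point, with each view weighted by its viewing conditions. It is not the authors' implementation (see the linked repository for that). The linear scorer, the two viewing-condition features (normalized distance and cosine of the viewing angle), the hand-picked weights, and all function names are assumptions chosen for illustration; in a trained model the scoring function would be learned end to end.

```python
import math


def view_scores(conditions, weights, bias=0.0):
    """Score each view from its viewing-condition vector.

    Hypothetical linear scorer: a learned network would replace this.
    """
    return [sum(w * c for w, c in zip(weights, cond)) + bias for cond in conditions]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_views(features, conditions, weights, bias=0.0):
    """Attention-weighted sum of the per-view feature vectors of one 3D point."""
    if not features:
        raise ValueError("the point is not seen in any view")
    attn = softmax(view_scores(conditions, weights, bias))
    dim = len(features[0])
    return [sum(a * f[d] for a, f in zip(attn, features)) for d in range(dim)]


# Toy data (made up): one 3D point seen in three images.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]    # 2D features projected onto the point
conditions = [[0.2, 0.9], [0.8, 0.3], [0.5, 0.5]]  # assumed: (normalized distance, cos of viewing angle)
weights = [-2.0, 1.0]                              # assumed: penalize distance, reward frontal views

attn = softmax(view_scores(conditions, weights))
fused = aggregate_views(features, conditions, weights)
```

In this toy setup the first view is the closest and most frontal, so it receives the largest attention weight and dominates the fused feature. The fused vector can then be passed to a 3D network alongside the point's own features.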

Results

Task	Dataset	Metric	Value	Model
Semantic Segmentation	S3DIS	mIoU (6-fold)	74.7	DeepViewAgg
Semantic Segmentation	S3DIS	Params (M)	41.2	DeepViewAgg
Semantic Segmentation	S3DIS	mAcc	83.8	DeepViewAgg
Semantic Segmentation	S3DIS	oAcc	90.1	DeepViewAgg
Semantic Segmentation	KITTI-360	mIoU Category	73.66	DeepViewAgg
Semantic Segmentation	KITTI-360	mIoU	58.3	DeepViewAgg
Semantic Segmentation	KITTI-360	mIoU (val)	57.8	DeepViewAgg
Semantic Segmentation	KITTI-360	mIoU Category	74.08	MinkowskiNet
Semantic Segmentation	KITTI-360	mIoU	53.92	MinkowskiNet
Semantic Segmentation	KITTI-360	mIoU (val)	54.2	MinkowskiNet
3D Semantic Segmentation	KITTI-360	mIoU Category	73.66	DeepViewAgg
3D Semantic Segmentation	KITTI-360	mIoU	58.3	DeepViewAgg
3D Semantic Segmentation	KITTI-360	mIoU (val)	57.8	DeepViewAgg
3D Semantic Segmentation	KITTI-360	mIoU Category	74.08	MinkowskiNet
3D Semantic Segmentation	KITTI-360	mIoU	53.92	MinkowskiNet
3D Semantic Segmentation	KITTI-360	mIoU (val)	54.2	MinkowskiNet
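
For readers unfamiliar with the metrics in the table above, the following sketch shows how overall accuracy (oAcc), mean class accuracy (mAcc), and mean intersection-over-union (mIoU) are commonly derived from a confusion matrix. The function name, the toy confusion matrix, and the choice to skip classes that never appear are assumptions for illustration; the official S3DIS and KITTI-360 evaluation protocols may differ in such details, and the table reports values as percentages.

```python
def segmentation_metrics(conf):
    """Compute oAcc, mAcc and mIoU from a square confusion matrix.

    conf[i][j] is the number of points of true class i predicted as class j.
    Classes absent from both labels and predictions are skipped (an assumed convention).
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    if total == 0:
        raise ValueError("empty confusion matrix")
    correct = sum(conf[i][i] for i in range(n))

    ious, accs = [], []
    for i in range(n):
        tp = conf[i][i]
        gt = sum(conf[i])                          # points whose true class is i
        pred = sum(conf[j][i] for j in range(n))   # points predicted as class i
        union = gt + pred - tp
        if union > 0:
            ious.append(tp / union)
        if gt > 0:
            accs.append(tp / gt)

    return {
        "oAcc": correct / total,
        "mAcc": sum(accs) / len(accs),
        "mIoU": sum(ious) / len(ious),
    }


# Toy two-class example (made-up counts, not taken from the paper).
metrics = segmentation_metrics([[8, 2], [1, 9]])
```

Here class 0 has IoU 8/11 and class 1 has IoU 9/12, so the mIoU is their average, while both oAcc and mAcc come out to 0.85.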
