HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features

Cheng Sun, Min Sun, Hwann-Tzong Chen

2020-11-23 | CVPR 2021 | 3D Room Layouts From A Single RGB Panorama, Semantic Segmentation, Depth Estimation

Abstract

We present HoHoNet, a versatile and efficient framework for holistic understanding of an indoor 360-degree panorama using a Latent Horizontal Feature (LHFeat). The compact LHFeat flattens the features along the vertical direction and has shown success in modeling per-column modality for room layout reconstruction. HoHoNet advances in two important aspects. First, the deep architecture is redesigned to run faster with improved accuracy. Second, we propose a novel horizon-to-dense module, which relaxes the per-column output shape constraint, allowing per-pixel dense prediction from LHFeat. HoHoNet is fast: it runs at 52 FPS and 110 FPS with ResNet-50 and ResNet-34 backbones, respectively, for modeling dense modalities from a high-resolution $512 \times 1024$ panorama. HoHoNet is also accurate. On the tasks of layout estimation and semantic segmentation, HoHoNet achieves results on par with the current state of the art. On dense depth estimation, HoHoNet outperforms all prior art by a large margin.
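
To make the shapes concrete, the following is a minimal sketch of the idea described in the abstract: a feature map is collapsed along the vertical axis into a per-column feature, which is then expanded back into a per-pixel map. It is an illustration only, not the authors' architecture. The mean pooling used for flattening, the linear projection used for the expansion, and the names `flatten_vertical`, `horizon_to_dense`, and `weight` are all assumptions made for this example.

```python
import numpy as np

def flatten_vertical(feat):
    """Collapse a (C, H, W) feature map to (C, W).

    Assumption: plain averaging over the height axis stands in for
    whatever learned reduction the real network uses.
    """
    return feat.mean(axis=1)

def horizon_to_dense(lhfeat, weight):
    """Expand a (C, W) per-column feature to an (H, W) dense map.

    Assumption: a single (H, C) projection, applied independently to
    each column, stands in for the horizon-to-dense module.
    """
    return weight @ lhfeat

rng = np.random.default_rng(0)
C, H, W = 8, 16, 32                      # toy sizes, not the paper's
feat = rng.standard_normal((C, H, W))    # stand-in for backbone features
weight = rng.standard_normal((H, C))     # hypothetical learned projection

lhfeat = flatten_vertical(feat)          # shape (C, W): one vector per column
dense = horizon_to_dense(lhfeat, weight) # shape (H, W): one value per pixel
```

The point of the sketch is the shape bookkeeping: the intermediate representation has no height axis, yet each column vector carries enough channels to decode a full column of per-pixel outputs.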

Results

Task	Dataset	Metric	Value	Model
Depth Estimation	Stanford2D3D Panoramic	RMSE	0.3834	HoHoNet (ResNet-101)
Depth Estimation	Stanford2D3D Panoramic	absolute relative error	0.1014	HoHoNet (ResNet-101)
3D Reconstruction	Stanford2D3D Panoramic	3DIoU	79.88	HoHoNet (ResNet-101)
Semantic Segmentation	Stanford2D3D Panoramic - RGBD	mAcc	68.9	HoHoNet (ResNet-101)
Semantic Segmentation	Stanford2D3D Panoramic - RGBD	mIoU	56.3	HoHoNet (ResNet-101)
Semantic Segmentation	Stanford2D3D Panoramic	mAcc	65	HoHoNet (ResNet-101)
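
The two depth metrics in the table, RMSE and absolute relative error, are commonly defined as in the sketch below. It assumes that pixels with non-positive ground-truth depth are invalid and excluded. The exact masking and depth range used in the HoHoNet evaluation are not stated here and may differ. The function name `depth_metrics` is hypothetical.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Return (RMSE, absolute relative error) over valid pixels.

    Assumption: a pixel is valid when its ground-truth depth is > 0.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > 0
    diff = pred[valid] - gt[valid]
    rmse = float(np.sqrt(np.mean(diff ** 2)))
    abs_rel = float(np.mean(np.abs(diff) / gt[valid]))
    return rmse, abs_rel

# Toy example: the third pixel has no ground truth and is ignored.
rmse, abs_rel = depth_metrics([1.0, 5.0, 3.0], [2.0, 4.0, 0.0])
```

Lower is better for both metrics. RMSE is in the same unit as the depth maps, while absolute relative error is unitless.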
