


HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features

Cheng Sun, Min Sun, Hwann-Tzong Chen

Date: 2020-11-23 | Venue: CVPR 2021 | Tasks: 3D Room Layouts From A Single RGB Panorama, Semantic Segmentation, Depth Estimation

Abstract

We present HoHoNet, a versatile and efficient framework for holistic understanding of an indoor 360-degree panorama using a Latent Horizontal Feature (LHFeat). The compact LHFeat flattens the features along the vertical direction and has shown success in modeling per-column modality for room layout reconstruction. HoHoNet advances in two important aspects. First, the deep architecture is redesigned to run faster with improved accuracy. Second, we propose a novel horizon-to-dense module, which relaxes the per-column output shape constraint, allowing per-pixel dense prediction from LHFeat. HoHoNet is fast: It runs at 52 FPS and 110 FPS with ResNet-50 and ResNet-34 backbones respectively, for modeling dense modalities from a high-resolution $512 \times 1024$ panorama. HoHoNet is also accurate. On the tasks of layout estimation and semantic segmentation, HoHoNet achieves results on par with current state-of-the-art. On dense depth estimation, HoHoNet outperforms all the prior arts by a large margin.
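The shape bookkeeping behind the abstract can be illustrated with a small sketch. It shows a feature map being flattened along the vertical axis into a per-column feature, then expanded back into a dense per-pixel map. The abstract does not say how either step is implemented, so the mean pooling in `flatten_vertical` and the linear projection in `horizon_to_dense` are placeholder assumptions, and both function names are hypothetical. This is not the HoHoNet architecture.

```python
import numpy as np

def flatten_vertical(feat):
    """Collapse the height axis of a (C, H, W) feature map into a (C, W) per-column feature.

    Mean pooling is a stand-in assumption; the abstract does not specify the operator.
    """
    return feat.mean(axis=1)

def horizon_to_dense(lhfeat, weights):
    """Expand a (C, W) per-column feature into an (H, W) dense map.

    `weights` has shape (H, C): each column's C-vector is mapped to H values.
    This linear map is a hypothetical placeholder for the horizon-to-dense module.
    """
    return weights @ lhfeat

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 32))     # toy (C, H, W) feature map
lhfeat = flatten_vertical(feat)             # per-column feature, shape (8, 32)
weights = rng.standard_normal((16, 8))      # hypothetical (H, C) projection
dense = horizon_to_dense(lhfeat, weights)   # dense per-pixel map, shape (16, 32)
print(lhfeat.shape, dense.shape)
```

The point of the sketch is only that a representation with one feature vector per image column can still yield one prediction per pixel, provided each column predicts a full vertical strip of values.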

Results

| Task | Dataset | Metric | Value | Model |
| Depth Estimation | Stanford2D3D Panoramic | RMSE | 0.3834 | HoHoNet (ResNet-101) |
| Depth Estimation | Stanford2D3D Panoramic | absolute relative error | 0.1014 | HoHoNet (ResNet-101) |
| 3D Reconstruction | Stanford2D3D Panoramic | 3DIoU | 79.88 | HoHoNet (ResNet-101) |
| Scene Parsing | Stanford2D3D Panoramic | 3DIoU | 79.88 | HoHoNet (ResNet-101) |
| Semantic Segmentation | Stanford2D3D Panoramic - RGBD | mAcc | 68.9 | HoHoNet (ResNet-101) |
| Semantic Segmentation | Stanford2D3D Panoramic - RGBD | mIoU | 56.3 | HoHoNet (ResNet-101) |
| Semantic Segmentation | Stanford2D3D Panoramic | mAcc | 65 | HoHoNet (ResNet-101) |
| 3D | Stanford2D3D Panoramic | RMSE | 0.3834 | HoHoNet (ResNet-101) |
| 3D | Stanford2D3D Panoramic | absolute relative error | 0.1014 | HoHoNet (ResNet-101) |
| 3D | Stanford2D3D Panoramic | 3DIoU | 79.88 | HoHoNet (ResNet-101) |
| Scene Understanding | Stanford2D3D Panoramic | 3DIoU | 79.88 | HoHoNet (ResNet-101) |
| 2D Semantic Segmentation | Stanford2D3D Panoramic | 3DIoU | 79.88 | HoHoNet (ResNet-101) |
| 10-shot image generation | Stanford2D3D Panoramic - RGBD | mAcc | 68.9 | HoHoNet (ResNet-101) |
| 10-shot image generation | Stanford2D3D Panoramic - RGBD | mIoU | 56.3 | HoHoNet (ResNet-101) |
| 10-shot image generation | Stanford2D3D Panoramic | mAcc | 65 | HoHoNet (ResNet-101) |
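For readers unfamiliar with the metrics in the results above, the following sketch computes RMSE and absolute relative error for depth, and mIoU and mAcc for segmentation, using their common textbook definitions. The exact evaluation protocol behind the reported numbers (valid-pixel masking, depth clipping, class set) is not stated on this page, so the `gt > 0` validity mask and the rule of skipping classes absent from the ground truth are assumptions. The function names are hypothetical.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Return (RMSE, absolute relative error) over pixels with gt > 0 (assumed validity mask)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    valid = gt > 0
    diff = pred[valid] - gt[valid]
    rmse = float(np.sqrt(np.mean(diff ** 2)))
    abs_rel = float(np.mean(np.abs(diff) / gt[valid]))
    return rmse, abs_rel

def segmentation_metrics(pred, gt, num_classes):
    """Return (mIoU, mAcc), averaged over classes that appear in the ground truth."""
    pred = np.asarray(pred).ravel()
    gt = np.asarray(gt).ravel()
    ious, accs = [], []
    for c in range(num_classes):
        gt_c = gt == c
        if not gt_c.any():
            continue  # assumption: classes absent from gt are skipped
        pred_c = pred == c
        inter = np.logical_and(pred_c, gt_c).sum()
        union = np.logical_or(pred_c, gt_c).sum()
        ious.append(inter / union)          # per-class intersection over union
        accs.append(inter / gt_c.sum())     # per-class pixel accuracy (recall)
    return float(np.mean(ious)), float(np.mean(accs))

# Toy inputs: the last depth pixel has gt == 0 and is ignored.
rmse, abs_rel = depth_metrics([[1.0, 2.0], [3.0, 5.0]], [[1.0, 2.0], [4.0, 0.0]])
miou, macc = segmentation_metrics([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(rmse, abs_rel, miou, macc)
```

These functions return fractions in the range 0 to 1, whereas the mIoU and mAcc values listed above appear to be percentages.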
