
LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image

Chuhang Zou, Alex Colburn, Qi Shan, Derek Hoiem

Published: 2018-03-23
Venue: CVPR 2018 (June)
Tasks: 3D Room Layouts From A Single RGB Panorama, Translation

Abstract

We propose an algorithm to predict room layout from a single image that generalizes across panoramas and perspective images, cuboid layouts and more general layouts (e.g. L-shape room). Our method operates directly on the panoramic image, rather than decomposing into perspective images as do recent works. Our network architecture is similar to that of RoomNet, but we show improvements due to aligning the image based on vanishing points, predicting multiple layout elements (corners, boundaries, size and translation), and fitting a constrained Manhattan layout to the resulting predictions. Our method compares well in speed and accuracy to other existing work on panoramas, achieves among the best accuracy for perspective images, and can handle both cuboid-shaped and more general Manhattan layouts.
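To make "operates directly on the panoramic image" more concrete, here is a minimal sketch of how a pixel in an equirectangular panorama can be mapped to viewing angles, and how a floor corner pixel can then be placed in 3D. This is not the authors' code. The function names, the axis convention (y forward, z up), and the default camera height of 1.6 are all assumptions made for illustration.

```python
import math


def pixel_to_angles(u, v, width, height):
    """Map an equirectangular pixel to (longitude, latitude) in radians.

    Assumed convention: the image centre looks along longitude 0 at the
    horizon, u grows to the right and v grows downwards.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - v / height) * math.pi
    return lon, lat


def floor_point(u, v, width, height, camera_height=1.6):
    """Intersect the viewing ray through pixel (u, v) with the floor plane.

    The camera sits at the origin, camera_height above the floor
    (hypothetical default). Returns (x, y, z) with z up.
    """
    lon, lat = pixel_to_angles(u, v, width, height)
    if lat >= 0:
        raise ValueError("pixel must lie below the horizon to hit the floor")
    # Horizontal distance from the camera to the point on the floor.
    dist = camera_height / math.tan(-lat)
    return (dist * math.sin(lon), dist * math.cos(lon), -camera_height)


if __name__ == "__main__":
    # A pixel straight ahead, 45 degrees below the horizon.
    print(floor_point(512, 384, 1024, 512))
```

Under these assumptions, a set of predicted floor corners yields a floor-plan polygon, which is the kind of quantity a Manhattan layout fit could then constrain to right angles.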

Results

The same four results are listed under each of these tasks: 3D Reconstruction, Scene Parsing, 3D, Scene Understanding, 2D Semantic Segmentation.

| Dataset | Metric | Value | Model |
| --- | --- | --- | --- |
| Stanford2D3D Panoramic | 3DIoU | 76.33 | LayoutNet |
| Stanford2D3D Panoramic | Corner Error | 1.04 | LayoutNet |
| Stanford2D3D Panoramic | Pixel Error | 2.7 | LayoutNet |
| PanoContext | 3DIoU | 74.48 | LayoutNet |
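As a rough illustration of the 3DIoU metric reported above, the sketch below computes intersection over union for two axis-aligned cuboids. It covers only the simplest cuboid case and is not the benchmark's evaluation code. The box encoding `(xmin, ymin, zmin, xmax, ymax, zmax)` and the function names are assumptions.

```python
def box_volume(box):
    """Volume of an axis-aligned box (xmin, ymin, zmin, xmax, ymax, zmax).

    Degenerate or inverted extents count as zero volume.
    """
    xmin, ymin, zmin, xmax, ymax, zmax = box
    return (max(0.0, xmax - xmin)
            * max(0.0, ymax - ymin)
            * max(0.0, zmax - zmin))


def iou_3d(a, b):
    """Intersection over union of two axis-aligned boxes."""
    # The overlap region is itself a box; it is empty if any extent inverts.
    inter = (max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
             min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]))
    inter_vol = box_volume(inter)
    union_vol = box_volume(a) + box_volume(b) - inter_vol
    return inter_vol / union_vol if union_vol > 0 else 0.0


if __name__ == "__main__":
    predicted = (0.0, 0.0, 0.0, 1.0, 1.0, 1.0)
    ground_truth = (0.5, 0.0, 0.0, 1.5, 1.0, 1.0)
    print(iou_3d(predicted, ground_truth))
```

For layouts that are not cuboids, such as an L-shaped room, the intersection would need a polygon overlap of the floor plans multiplied by the shared height range, which this sketch does not attempt.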
