Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


MonoScene: Monocular 3D Semantic Scene Completion

Anh-Quan Cao, Raoul de Charette

Date: 2021-12-01
Venue: CVPR 2022
Tasks: 3D Scene Reconstruction, 3D Reconstruction, Single-View 3D Reconstruction, 3D Semantic Scene Completion from a single RGB image, 3D Semantic Scene Completion

Abstract

MonoScene proposes a 3D Semantic Scene Completion (SSC) framework in which the dense geometry and semantics of a scene are inferred from a single monocular RGB image. Unlike the SSC literature, which relies on 2.5D or 3D input, we solve the complex problem of 2D-to-3D scene reconstruction while jointly inferring its semantics. Our framework relies on successive 2D and 3D UNets bridged by a novel 2D-3D feature projection inspired by optics, and introduces a 3D context relation prior to enforce spatio-semantic consistency. Along with the architectural contributions, we introduce novel global scene and local frustum losses. Experiments show that we outperform the literature on all metrics and datasets while hallucinating plausible scenery even beyond the camera field of view. Our code and trained models are available at https://github.com/cv-rits/MonoScene.
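The abstract only names the 2D-3D feature projection without detailing it. As a rough illustration of the general idea of lifting image features into a voxel grid, the sketch below projects voxel centers into the image with a pinhole camera model and copies the 2D feature found at each projected pixel. Everything here is an assumption made for illustration and is not taken from the MonoScene code: the function name `lift_features`, the nearest-neighbour lookup, the zero fill for voxels outside the image, and the single feature scale.

```python
import numpy as np

def lift_features(feat2d, centers_cam, K):
    """Assign each voxel the 2D feature at its projected pixel.

    Hypothetical helper, not the authors' implementation.
    feat2d:      (C, H, W) image feature map.
    centers_cam: (N, 3) voxel centers in the camera frame (x right, y down, z forward).
    K:           (3, 3) pinhole intrinsics.
    Returns (N, C); voxels behind the camera or outside the image get zeros.
    """
    C, H, W = feat2d.shape
    n = centers_cam.shape[0]
    out = np.zeros((n, C), dtype=feat2d.dtype)

    # Only points in front of the camera can be projected.
    in_front = centers_cam[:, 2] > 0
    uvw = centers_cam[in_front] @ K.T           # rows are K @ point
    u = np.rint(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.rint(uvw[:, 1] / uvw[:, 2]).astype(int)

    # Keep projections that land inside the feature map.
    in_image = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    idx = np.flatnonzero(in_front)[in_image]

    # Nearest-neighbour gather: (C, M) -> (M, C).
    out[idx] = feat2d[:, v[in_image], u[in_image]].T
    return out
```

In this sketch, all voxels along the same line of sight receive the same 2D feature, so any disambiguation in depth would have to be learned by a subsequent 3D network.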

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Reconstruction | KITTI-360 | mIoU | 12.31 | MonoScene |
| Reconstruction | NYUv2 | mIoU | 26.94 | MonoScene |
| Reconstruction | SemanticKITTI | mIoU | 11.08 | MonoScene |
| 3D Reconstruction | NYUv2 | mIoU | 26.94 | MonoScene (RGB input only) |
| 3D Reconstruction | SemanticKITTI | mIoU | 11.08 | MonoScene (RGB input only) |
| 3D Reconstruction | KITTI-360 | mIoU | 12.31 | MonoScene |
| 3D Reconstruction | NYUv2 | mIoU | 26.94 | MonoScene |
| 3D Reconstruction | SemanticKITTI | mIoU | 11.08 | MonoScene |
| 3D | NYUv2 | mIoU | 26.94 | MonoScene (RGB input only) |
| 3D | SemanticKITTI | mIoU | 11.08 | MonoScene (RGB input only) |
| 3D | KITTI-360 | mIoU | 12.31 | MonoScene |
| 3D | NYUv2 | mIoU | 26.94 | MonoScene |
| 3D | SemanticKITTI | mIoU | 11.08 | MonoScene |
| 3D Semantic Scene Completion | NYUv2 | mIoU | 26.94 | MonoScene (RGB input only) |
| 3D Semantic Scene Completion | SemanticKITTI | mIoU | 11.08 | MonoScene (RGB input only) |
| 3D Semantic Scene Completion | KITTI-360 | mIoU | 12.31 | MonoScene |
| 3D Semantic Scene Completion | NYUv2 | mIoU | 26.94 | MonoScene |
| 3D Semantic Scene Completion | SemanticKITTI | mIoU | 11.08 | MonoScene |
| 3D Scene Reconstruction | KITTI-360 | mIoU | 12.31 | MonoScene |
| 3D Scene Reconstruction | NYUv2 | mIoU | 26.94 | MonoScene |
| 3D Scene Reconstruction | SemanticKITTI | mIoU | 11.08 | MonoScene |
| Single-View 3D Reconstruction | KITTI-360 | mIoU | 12.31 | MonoScene |
| Single-View 3D Reconstruction | NYUv2 | mIoU | 26.94 | MonoScene |
| Single-View 3D Reconstruction | SemanticKITTI | mIoU | 11.08 | MonoScene |
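Every result listed here uses mIoU (mean intersection over union). For readers unfamiliar with the metric, a minimal sketch of how it can be computed over voxel labels follows. It assumes values are reported as percentages, that voxels without ground truth carry a hypothetical `ignore_label` of 255, and that classes absent from both prediction and ground truth are skipped. Each benchmark's official evaluation may differ on these points, for example in which classes enter the mean.

```python
import numpy as np

def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean intersection-over-union across classes, in percent.

    pred, target: integer class labels of identical shape (e.g. voxel grids).
    ignore_label: assumed marker for voxels without ground truth.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()

    # Drop voxels that have no ground-truth label.
    keep = target != ignore_label
    pred, target = pred[keep], target[keep]

    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            # Class absent from both prediction and ground truth: skip it.
            continue
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)

    return 100.0 * float(np.mean(ious)) if ious else float("nan")
```

For instance, with predictions `[0, 1, 1, 2]` and ground truth `[0, 1, 2, 2]` over three classes, the per-class IoUs are 1, 1/2, and 1/2, so the mean is 2/3.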

Related Papers

AutoPartGen: Autogressive 3D Part Generation and Discovery (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
BRUM: Robust 3D Vehicle Reconstruction from 360 Sparse Images (2025-07-16)
Physically Based Neural LiDAR Resimulation (2025-07-15)
Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation (2025-07-15)
Binomial Self-Compensation: Mechanism and Suppression of Motion Error in Phase-Shifting Profilometry (2025-07-14)
An Efficient Approach for Muscle Segmentation and 3D Reconstruction Using Keypoint Tracking in MRI Scan (2025-07-11)
Review of Feed-forward 3D Reconstruction: From DUSt3R to VGGT (2025-07-11)