Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Semantic Scene Completion from a Single Depth Image

Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, Thomas Funkhouser

2016-11-28 | CVPR 2017 7 | 3D Semantic Scene Completion

Abstract

This paper focuses on semantic scene completion, a task for producing a complete 3D voxel representation of volumetric occupancy and semantic labels for a scene from a single-view depth map observation. Previous work has considered scene completion and semantic labeling of depth maps separately. However, we observe that these two problems are tightly intertwined. To leverage the coupled nature of these two tasks, we introduce the semantic scene completion network (SSCNet), an end-to-end 3D convolutional network that takes a single depth image as input and simultaneously outputs occupancy and semantic labels for all voxels in the camera view frustum. Our network uses a dilation-based 3D context module to efficiently expand the receptive field and enable 3D context learning. To train our network, we construct SUNCG - a manually created large-scale dataset of synthetic 3D scenes with dense volumetric annotations. Our experiments demonstrate that the joint model outperforms methods addressing each task in isolation and outperforms alternative approaches on the semantic scene completion task.
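The abstract states that dilation expands the receptive field efficiently. A minimal sketch of the arithmetic behind that claim follows, assuming stride-1 convolutions; the layer lists are hypothetical illustrations and not SSCNet's actual architecture. For stride-1 layers, each layer with kernel size k and dilation d widens the receptive field by (k - 1) * d along each axis, while the number of weights per layer depends only on k.

```python
def receptive_field(layers):
    """Receptive field (per axis, in voxels) of stacked stride-1 convolutions.

    layers: list of (kernel_size, dilation) tuples. These configurations are
    illustrative assumptions, not the layer settings used in the paper.
    """
    rf = 1
    for kernel_size, dilation in layers:
        # A dilated kernel spans (kernel_size - 1) * dilation + 1 voxels,
        # so each layer adds (kernel_size - 1) * dilation to the field.
        rf += (kernel_size - 1) * dilation
    return rf


def weights_per_filter_3d(kernel_size):
    """Weights in one 3D filter per input channel; independent of dilation."""
    return kernel_size ** 3


plain = [(3, 1)] * 4    # four ordinary 3x3x3 convolutions
dilated = [(3, 2)] * 4  # the same stack with dilation 2

print(receptive_field(plain))      # 9
print(receptive_field(dilated))    # 17
print(weights_per_filter_3d(3))    # 27 in both cases
```

In this toy comparison the dilated stack covers nearly twice the spatial extent per axis with the same number of weights, which is the sense in which dilation is a cheap way to gather 3D context.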

Results

Task                          Dataset        Metric  Value  Model
3D Reconstruction             NYUv2          mIoU    30.5   SSCNet (SUNCG pretraining)
3D Reconstruction             NYUv2          mIoU    24.7   SSCNet
3D Reconstruction             SemanticKITTI  mIoU    16.1   SSCNet (reported in LMSCNet)
3D Reconstruction             SemanticKITTI  mIoU    16.1   SSCNet-full (reported in LMSCNet)
3D Reconstruction             KITTI-360      mIoU    16.95  SSCNet
3D                            NYUv2          mIoU    30.5   SSCNet (SUNCG pretraining)
3D                            NYUv2          mIoU    24.7   SSCNet
3D                            SemanticKITTI  mIoU    16.1   SSCNet (reported in LMSCNet)
3D                            SemanticKITTI  mIoU    16.1   SSCNet-full (reported in LMSCNet)
3D                            KITTI-360      mIoU    16.95  SSCNet
3D Semantic Scene Completion  NYUv2          mIoU    30.5   SSCNet (SUNCG pretraining)
3D Semantic Scene Completion  NYUv2          mIoU    24.7   SSCNet
3D Semantic Scene Completion  SemanticKITTI  mIoU    16.1   SSCNet (reported in LMSCNet)
3D Semantic Scene Completion  SemanticKITTI  mIoU    16.1   SSCNet-full (reported in LMSCNet)
3D Semantic Scene Completion  KITTI-360      mIoU    16.95  SSCNet
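The results above are reported as mIoU (mean intersection over union), expressed as percentages. As a rough sketch of how such a score can be computed over flattened voxel labels: the function name, the `ignore_label` value of 255, and the toy label lists below are all assumptions for illustration. Each benchmark defines its own evaluation protocol (which voxels are scored, whether the empty class is counted), so this is not the exact procedure behind the numbers in the table.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean per-class IoU over flattened voxel labels.

    pred, target: equal-length sequences of integer class ids.
    Voxels whose target equals ignore_label (an assumed convention) are
    skipped. Predictions are assumed to lie in [0, num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Average only over classes that appear in pred or target.
    ious = [intersection[c] / union[c] for c in range(num_classes) if union[c] > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 3 classes; the last voxel is ignored.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 2, 255]
print(round(100 * mean_iou(pred, target, num_classes=3), 2))  # 72.22
```

Here the per-class IoUs are 1/1, 1/2, and 2/3, so the mean is about 72.22 when scaled to a percentage like the values in the table.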

Related Papers

Disentangling Instance and Scene Contexts for 3D Semantic Scene Completion (2025-07-11)
Camera-Only 3D Panoptic Scene Completion for Autonomous Driving through Differentiable Object Shapes (2025-05-14)
SGFormer: Satellite-Ground Fusion for 3D Semantic Scene Completion (2025-03-21)
VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion (2025-03-08)
Vision-based 3D Semantic Scene Completion via Capture Dynamic Representations (2025-03-08)
Learning Temporal 3D Semantic Scene Completion via Optical Flow Guidance (2025-02-20)
Skip Mamba Diffusion for Monocular 3D Semantic Scene Completion (2025-01-13)
SOAP: Vision-Centric 3D Semantic Scene Completion with Scene-Adaptive Decoder and Occluded Region-Aware View Projection (2025-01-01)