TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Papers/OccDepth: A Depth-Aware Method for 3D Semantic Scene Compl...

OccDepth: A Depth-Aware Method for 3D Semantic Scene Completion

Ruihang Miao, Weizhou Liu, Mingrui Chen, Zheng Gong, Weixin Xu, Chen Hu, Shuchang Zhou

2023-02-273D geometryAutonomous DrivingKnowledge Distillation3D Semantic Scene Completion
PaperPDFCode(official)

Abstract

3D Semantic Scene Completion (SSC) can provide dense geometric and semantic scene representations, which can be applied in the field of autonomous driving and robotic systems. It is challenging to estimate the complete geometry and semantics of a scene solely from visual images, and accurate depth information is crucial for restoring 3D geometry. In this paper, we propose the first stereo SSC method named OccDepth, which fully exploits implicit depth information from stereo images (or RGBD images) to help the recovery of 3D geometric structures. The Stereo Soft Feature Assignment (Stereo-SFA) module is proposed to better fuse 3D depth-aware features by implicitly learning the correlation between stereo images. In particular, when the input are RGBD image, a virtual stereo images can be generated through original RGB image and depth map. Besides, the Occupancy Aware Depth (OAD) module is used to obtain geometry-aware 3D features by knowledge distillation using pre-trained depth models. In addition, a reformed TartanAir benchmark, named SemanticTartanAir, is provided in this paper for further testing our OccDepth method on SSC task. Compared with the state-of-the-art RGB-inferred SSC method, extensive experiments on SemanticKITTI show that our OccDepth method achieves superior performance with improving +4.82% mIoU, of which +2.49% mIoU comes from stereo images and +2.33% mIoU comes from our proposed depth-aware method. Our code and trained models are available at https://github.com/megvii-research/OccDepth.

Results

TaskDatasetMetricValueModel
3D ReconstructionNYUv2mIoU30.9OccDepth
3D ReconstructionSemanticKITTImIoU15.9OccDepth(RGB input only)
3DNYUv2mIoU30.9OccDepth
3DSemanticKITTImIoU15.9OccDepth(RGB input only)
3D Semantic Scene CompletionNYUv2mIoU30.9OccDepth
3D Semantic Scene CompletionSemanticKITTImIoU15.9OccDepth(RGB input only)

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment2025-07-21GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving2025-07-19AGENTS-LLM: Augmentative GENeration of Challenging Traffic Scenarios with an Agentic LLM Framework2025-07-18World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving2025-07-17Orbis: Overcoming Challenges of Long-Horizon Prediction in Driving World Models2025-07-17Channel-wise Motion Features for Efficient Motion Segmentation2025-07-17LaViPlan : Language-Guided Visual Path Planning with RLVR2025-07-17Uncertainty-Aware Cross-Modal Knowledge Distillation with Prototype Learning for Multimodal Brain-Computer Interfaces2025-07-17