Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts

Ji Hou, Benjamin Graham, Matthias Nießner, Saining Xie

2020-12-16 | CVPR 2021 | Tasks: Scene Understanding, Segmentation, Semantic Segmentation, Instance Segmentation, 3D Semantic Segmentation


Abstract

The rapid progress in 3D scene understanding has come with growing demand for data; however, collecting and annotating 3D scenes (e.g. point clouds) is notoriously hard. For example, the number of scenes (e.g. indoor rooms) that can be accessed and scanned might be limited; even given sufficient data, acquiring 3D labels (e.g. instance masks) requires intensive human labor. In this paper, we explore data-efficient learning for 3D point clouds. As a first step in this direction, we propose Contrastive Scene Contexts, a 3D pre-training method that makes use of both point-level correspondences and spatial contexts in a scene. Our method achieves state-of-the-art results on a suite of benchmarks where training data or labels are scarce. Our study reveals that exhaustive labelling of 3D point clouds might be unnecessary; remarkably, on ScanNet, even using only 0.1% of point labels, we still achieve 89% (instance segmentation) and 96% (semantic segmentation) of the baseline performance that uses full annotations.
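
The abstract does not spell out the training objective. As a rough illustration only, the sketch below assumes an InfoNCE-style contrastive loss over matched points from two views of the same scene, with the negatives for each anchor restricted to points that fall in the same spatial partition around that anchor. The function names, the two-bin distance partition, the radius, and the temperature are all hypothetical choices made for this example; they are not taken from the authors' code.

```python
import math


def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def partition_index(anchor_xyz, other_xyz, radius=1.0):
    """Hypothetical 2-bin spatial partition: 0 = near the anchor, 1 = far."""
    return 0 if math.dist(anchor_xyz, other_xyz) < radius else 1


def partitioned_info_nce(feats_a, feats_b, xyz, temperature=0.4,
                         num_partitions=2, radius=1.0):
    """Simplified contrastive loss with spatially partitioned negatives.

    feats_a[i] and feats_b[i] are features of the same 3D point in two views
    (the point-level correspondence); xyz[i] is that point's position.
    For each anchor i and each partition p, the positive is (i, i) and the
    negatives are the points k != i lying in partition p relative to i.
    """
    losses = []
    n = len(xyz)
    for i in range(n):
        pos = math.exp(dot(feats_a[i], feats_b[i]) / temperature)
        for p in range(num_partitions):
            negs = [
                math.exp(dot(feats_a[i], feats_b[k]) / temperature)
                for k in range(n)
                if k != i and partition_index(xyz[i], xyz[k], radius) == p
            ]
            if not negs:
                continue  # no negatives in this partition for this anchor
            losses.append(-math.log(pos / (pos + sum(negs))))
    return sum(losses) / len(losses) if losses else 0.0


# Toy usage: three matched points with one-hot features.
xyz = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
feats = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
aligned = partitioned_info_nce(feats, feats, xyz)
shuffled = partitioned_info_nce(feats, feats[1:] + feats[:1], xyz)
```

In this toy setup the loss is lower when the two views agree on matched points than when the second view's features are shuffled, which is the behaviour a contrastive pre-training objective is meant to reward.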

Results

Task	Dataset	Metric	Value	Model
Semantic Segmentation	S3DIS Area5	mIoU	72.2	CSC+MinkUNet
Semantic Segmentation	ScanNet200	test mIoU	24.9	CSC
Semantic Segmentation	ScanNet200	val mIoU	26.4	CSC
3D Semantic Segmentation	ScanNet200	test mIoU	24.9	CSC
3D Semantic Segmentation	ScanNet200	val mIoU	26.4	CSC
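
The results above are reported as mIoU (mean intersection over union). As a minimal sketch of how that metric is commonly computed from per-point predictions, assuming integer class labels and skipping classes that appear in neither the prediction nor the ground truth (a convention that individual benchmarks may handle differently, along with ignored labels):

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes, as a fraction in [0, 1].

    pred and target are equal-length sequences of integer class labels.
    Classes absent from both pred and target are skipped (an assumption;
    benchmark evaluation scripts may differ).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union == 0:
            continue
        ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy usage: four points, two classes.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Multiplying the returned fraction by 100 gives a value on the same percentage scale as the table.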

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection (2025-07-17)
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)