Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Point-GCC: Universal Self-supervised 3D Scene Pre-training via Geometry-Color Contrast

Guofan Fan, Zekun Qi, Wenkai Shi, Kaisheng Ma

Published: 2023-05-31
Tasks: Deep Clustering, 3D Instance Segmentation, Scene Understanding, Transfer Learning, Unsupervised 3D Semantic Segmentation, 3D Semantic Segmentation, 3D Object Detection, Object Detection

Abstract

Geometry and color information provided by point clouds are both crucial for 3D scene understanding. These two kinds of information characterize different aspects of point clouds, but existing methods lack an elaborate design for their discrimination and relevance. Hence we explore a 3D self-supervised paradigm that can better utilize the relations between these kinds of point cloud information. Specifically, we propose a universal 3D scene pre-training framework via Geometry-Color Contrast (Point-GCC), which aligns geometry and color information using a Siamese network. To address practical application tasks, we design (i) hierarchical supervision, with point-level contrast and reconstruction and object-level contrast based on a novel deep clustering module, to close the gap between pre-training and downstream tasks; and (ii) an architecture-agnostic backbone that adapts to various downstream models. Benefiting from the object-level representation associated with downstream tasks, Point-GCC can directly evaluate model performance, and the results demonstrate the effectiveness of our method. Transfer learning results on a wide range of tasks also show consistent improvements across all datasets, e.g., new state-of-the-art object detection results on the SUN RGB-D and S3DIS datasets. Code will be released at https://github.com/Asterisci/Point-GCC.
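
To make the idea of aligning geometry and color features concrete, here is a minimal sketch of a point-level contrastive objective. The abstract does not specify the loss, so this uses a symmetric InfoNCE loss as a stand-in, which is a common choice for contrastive alignment and not necessarily what Point-GCC implements. The function names, the temperature value, and the hand-written feature vectors are all illustrative assumptions; in practice the two feature sets would come from the two branches of the Siamese network.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def info_nce(anchors, targets, temperature=0.1):
    """Mean cross-entropy of matching anchors[i] to targets[i] among all targets.

    Assumption: InfoNCE is used here as a generic contrastive loss; the
    paper's actual objective may differ.
    """
    a = [l2_normalize(v) for v in anchors]
    t = [l2_normalize(v) for v in targets]
    total = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarity between point i and every target, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a_i, t_j)) / temperature for t_j in t]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)


def geometry_color_contrast(geo_feats, color_feats, temperature=0.1):
    """Symmetric loss: geometry -> color and color -> geometry."""
    return 0.5 * (
        info_nce(geo_feats, color_feats, temperature)
        + info_nce(color_feats, geo_feats, temperature)
    )


# Hypothetical per-point features: row i of each list describes the same point.
geo = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
color_aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
color_shuffled = [color_aligned[1], color_aligned[2], color_aligned[0]]

loss_aligned = geometry_color_contrast(geo, color_aligned)
loss_shuffled = geometry_color_contrast(geo, color_shuffled)
```

In this toy setup the loss is lower when the geometry and color features of the same point agree than when the color features are shuffled across points, which is the behavior a geometry-color alignment objective is meant to reward.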

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Semantic Segmentation | ScanNetV2 | mIoU | 18.3 | Point-GCC+PointNet++ |
| Object Detection | SUN-RGBD val | mAP@0.25 | 69.7 | Point-GCC+TR3D+FF |
| Object Detection | SUN-RGBD val | mAP@0.5 | 54 | Point-GCC+TR3D+FF |
| Object Detection | SUN-RGBD val | mAP@0.25 | 67.7 | Point-GCC+TR3D |
| Object Detection | SUN-RGBD val | mAP@0.5 | 51 | Point-GCC+TR3D |
| Object Detection | S3DIS | mAP@0.25 | 75.1 | Point-GCC+TR3D |
| Object Detection | S3DIS | mAP@0.5 | 56.7 | Point-GCC+TR3D |
| Object Detection | ScanNetV2 | mAP@0.25 | 73.1 | Point-GCC+TR3D |
| Object Detection | ScanNetV2 | mAP@0.5 | 59.6 | Point-GCC+TR3D |
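
The detection metrics above, mAP@0.25 and mAP@0.5, count a predicted 3D box as a match only if its intersection-over-union (IoU) with a ground-truth box reaches the stated threshold. The sketch below illustrates that matching criterion only. It assumes axis-aligned boxes given as (x_min, y_min, z_min, x_max, y_max, z_max), which is a simplification (some benchmarks use oriented boxes), and the function names and example boxes are hypothetical. Full mAP additionally ranks predictions by confidence and averages precision over classes, which is not shown here.

```python
def box_volume(box):
    """Volume of an axis-aligned box (x0, y0, z0, x1, y1, z1); 0 if degenerate."""
    x0, y0, z0, x1, y1, z1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0) * max(0.0, z1 - z0)


def iou_3d(box_a, box_b):
    """Intersection-over-union of two axis-aligned 3D boxes."""
    overlap = (
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        max(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
        min(box_a[4], box_b[4]),
        min(box_a[5], box_b[5]),
    )
    intersection = box_volume(overlap)
    union = box_volume(box_a) + box_volume(box_b) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(pred_box, gt_box, iou_threshold):
    """True if the prediction overlaps the ground truth enough to count as a hit."""
    return iou_3d(pred_box, gt_box) >= iou_threshold


# Hypothetical boxes: the prediction is shifted by half the box width along x.
gt = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
pred = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)

overlap_ratio = iou_3d(pred, gt)       # intersection 4, union 12
hit_at_025 = is_match(pred, gt, 0.25)
hit_at_050 = is_match(pred, gt, 0.5)
```

With these example boxes the IoU is one third, so the prediction counts as a match at the 0.25 threshold but not at 0.5. This is why mAP@0.5 values in the table are consistently lower than the corresponding mAP@0.25 values.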

Related Papers

Tri-Learn Graph Fusion Network for Attributed Graph Clustering (2025-07-18)
RaMen: Multi-Strategy Multi-Modal Learning for Bundle Construction (2025-07-18)
Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection (2025-07-17)
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
Disentangling coincident cell events using deep transfer learning and compressive sensing (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)