Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Masked Autoencoders for Point Cloud Self-supervised Learning

Yatian Pang, Wenxiao Wang, Francis E. H. Tay, Wei Liu, Yonghong Tian, Li Yuan

2022-03-13
Tasks: Few-Shot Learning, Few-Shot 3D Point Cloud Classification, Point Cloud Segmentation, 3D Part Segmentation, 3D Point Cloud Classification

Abstract

As a promising scheme of self-supervised learning, masked autoencoding has significantly advanced natural language processing and computer vision. Inspired by this, we propose a neat scheme of masked autoencoders for point cloud self-supervised learning, addressing the challenges posed by point cloud's properties, including leakage of location information and uneven information density. Concretely, we divide the input point cloud into irregular point patches and randomly mask them at a high ratio. Then, a standard Transformer based autoencoder, with an asymmetric design and a shifting mask tokens operation, learns high-level latent features from unmasked point patches, aiming to reconstruct the masked point patches. Extensive experiments show that our approach is efficient during pre-training and generalizes well on various downstream tasks. Specifically, our pre-trained models achieve 85.18% accuracy on ScanObjectNN and 94.04% accuracy on ModelNet40, outperforming all the other self-supervised learning methods. We show with our scheme, a simple architecture entirely based on standard Transformers can surpass dedicated Transformer models from supervised learning. Our approach also advances state-of-the-art accuracies by 1.5%-2.3% in the few-shot object classification. Furthermore, our work inspires the feasibility of applying unified architectures from languages and images to the point cloud.
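The patch-and-mask step described in the abstract can be illustrated with a small sketch. The abstract does not say how patches are formed, so this example assumes farthest point sampling for patch centers and k-nearest-neighbor grouping around each center. The function names, the patch count, the patch size, and the 0.6 mask ratio are all illustrative choices, not values taken from this page. The encoder, the mask tokens, and the reconstruction objective are out of scope here.

```python
import math
import random


def farthest_point_sampling(points, num_centers, rng):
    """Pick num_centers point indices that are spread out over the cloud."""
    first = rng.randrange(len(points))
    chosen = [first]
    # Distance from every point to its nearest chosen center so far.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(chosen) < num_centers:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def make_patches(points, num_patches, patch_size, rng):
    """Group each center with its patch_size nearest points (patches may overlap)."""
    centers = farthest_point_sampling(points, num_patches, rng)
    patches = []
    for c in centers:
        nearest = sorted(range(len(points)),
                         key=lambda i: math.dist(points[i], points[c]))
        patches.append(nearest[:patch_size])
    return centers, patches


def random_mask(num_patches, mask_ratio, rng):
    """Split patch indices into (masked, visible) at the given ratio."""
    num_masked = int(num_patches * mask_ratio)
    order = list(range(num_patches))
    rng.shuffle(order)
    return sorted(order[:num_masked]), sorted(order[num_masked:])


# Toy cloud: 256 random points in a cube (synthetic data, for illustration only).
rng = random.Random(0)
cloud = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
         for _ in range(256)]
centers, patches = make_patches(cloud, num_patches=16, patch_size=8, rng=rng)
masked, visible = random_mask(len(patches), mask_ratio=0.6, rng=rng)
print(len(patches), len(masked), len(visible))  # 16 9 7
```

In a setup like this, only the visible patches would be passed to the encoder, and the masked patches would serve as reconstruction targets.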

Results

Task | Dataset | Metric | Value | Model
Shape Representation Of 3D Point Clouds | ScanObjectNN | OBJ-BG (OA) | 90.02 | Point-MAE
Shape Representation Of 3D Point Clouds | ScanObjectNN | OBJ-ONLY (OA) | 88.29 | Point-MAE
Shape Representation Of 3D Point Clouds | ScanObjectNN | Overall Accuracy | 85.2 | Point-MAE
Shape Representation Of 3D Point Clouds | ModelNet40 | Overall Accuracy | 94 | Point-MAE
Shape Representation Of 3D Point Clouds | ModelNet40 10-way (20-shot) | Overall Accuracy | 95 | Point-MAE
Shape Representation Of 3D Point Clouds | ModelNet40 10-way (20-shot) | Standard Deviation | 3 | Point-MAE
Shape Representation Of 3D Point Clouds | ModelNet40 5-way (10-shot) | Overall Accuracy | 96.3 | Point-MAE
Shape Representation Of 3D Point Clouds | ModelNet40 5-way (10-shot) | Standard Deviation | 2.5 | Point-MAE
Shape Representation Of 3D Point Clouds | ModelNet40 10-way (10-shot) | Overall Accuracy | 92.6 | Point-MAE
Shape Representation Of 3D Point Clouds | ModelNet40 10-way (10-shot) | Standard Deviation | 4.1 | Point-MAE
Shape Representation Of 3D Point Clouds | ModelNet40 5-way (20-shot) | Overall Accuracy | 97.8 | Point-MAE
Shape Representation Of 3D Point Clouds | ModelNet40 5-way (20-shot) | Standard Deviation | 1.8 | Point-MAE
3D Point Cloud Classification | ScanObjectNN | OBJ-BG (OA) | 90.02 | Point-MAE
3D Point Cloud Classification | ScanObjectNN | OBJ-ONLY (OA) | 88.29 | Point-MAE
3D Point Cloud Classification | ScanObjectNN | Overall Accuracy | 85.2 | Point-MAE
3D Point Cloud Classification | ModelNet40 | Overall Accuracy | 94 | Point-MAE
3D Point Cloud Classification | ModelNet40 10-way (20-shot) | Overall Accuracy | 95 | Point-MAE
3D Point Cloud Classification | ModelNet40 10-way (20-shot) | Standard Deviation | 3 | Point-MAE
3D Point Cloud Classification | ModelNet40 5-way (10-shot) | Overall Accuracy | 96.3 | Point-MAE
3D Point Cloud Classification | ModelNet40 5-way (10-shot) | Standard Deviation | 2.5 | Point-MAE
3D Point Cloud Classification | ModelNet40 10-way (10-shot) | Overall Accuracy | 92.6 | Point-MAE
3D Point Cloud Classification | ModelNet40 10-way (10-shot) | Standard Deviation | 4.1 | Point-MAE
3D Point Cloud Classification | ModelNet40 5-way (20-shot) | Overall Accuracy | 97.8 | Point-MAE
3D Point Cloud Classification | ModelNet40 5-way (20-shot) | Standard Deviation | 1.8 | Point-MAE
Point Cloud Segmentation | PointCloud-C | mean Corruption Error (mCE) | 0.927 | PointMAE
3D Point Cloud Reconstruction | ScanObjectNN | OBJ-BG (OA) | 90.02 | Point-MAE
3D Point Cloud Reconstruction | ScanObjectNN | OBJ-ONLY (OA) | 88.29 | Point-MAE
3D Point Cloud Reconstruction | ScanObjectNN | Overall Accuracy | 85.2 | Point-MAE
3D Point Cloud Reconstruction | ModelNet40 | Overall Accuracy | 94 | Point-MAE
3D Point Cloud Reconstruction | ModelNet40 10-way (20-shot) | Overall Accuracy | 95 | Point-MAE
3D Point Cloud Reconstruction | ModelNet40 10-way (20-shot) | Standard Deviation | 3 | Point-MAE
3D Point Cloud Reconstruction | ModelNet40 5-way (10-shot) | Overall Accuracy | 96.3 | Point-MAE
3D Point Cloud Reconstruction | ModelNet40 5-way (10-shot) | Standard Deviation | 2.5 | Point-MAE
3D Point Cloud Reconstruction | ModelNet40 10-way (10-shot) | Overall Accuracy | 92.6 | Point-MAE
3D Point Cloud Reconstruction | ModelNet40 10-way (10-shot) | Standard Deviation | 4.1 | Point-MAE
3D Point Cloud Reconstruction | ModelNet40 5-way (20-shot) | Overall Accuracy | 97.8 | Point-MAE
3D Point Cloud Reconstruction | ModelNet40 5-way (20-shot) | Standard Deviation | 1.8 | Point-MAE

Related Papers

GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection (2025-07-10)
An Enhanced Privacy-preserving Federated Few-shot Learning Framework for Respiratory Disease Diagnosis (2025-07-10)
Few-Shot Learning by Explicit Physics Integration: An Application to Groundwater Heat Transport (2025-07-08)
ViRefSAM: Visual Reference-Guided Segment Anything Model for Remote Sensing Segmentation (2025-07-03)
TSDASeg: A Two-Stage Model with Direct Alignment for Interactive Point Cloud Segmentation (2025-06-26)
Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning (2025-06-26)
Dynamic Context-Aware Prompt Recommendation for Domain-Specific AI Applications (2025-06-25)