Masked Autoencoders for Point Cloud Self-supervised Learning

Yatian Pang, Wenxiao Wang, Francis E. H. Tay, Wei Liu, Yonghong Tian, Li Yuan

2022-03-13

Few-Shot Learning, Few-Shot 3D Point Cloud Classification, Point Cloud Segmentation, 3D Part Segmentation, 3D Point Cloud Classification


Abstract

As a promising scheme of self-supervised learning, masked autoencoding has significantly advanced natural language processing and computer vision. Inspired by this, we propose a neat scheme of masked autoencoders for point cloud self-supervised learning, addressing the challenges posed by the properties of point clouds, including leakage of location information and uneven information density. Concretely, we divide the input point cloud into irregular point patches and randomly mask them at a high ratio. Then, a standard Transformer-based autoencoder, with an asymmetric design and a shifting mask tokens operation, learns high-level latent features from the unmasked point patches, aiming to reconstruct the masked point patches. Extensive experiments show that our approach is efficient during pre-training and generalizes well on various downstream tasks. Specifically, our pre-trained models achieve 85.18% accuracy on ScanObjectNN and 94.04% accuracy on ModelNet40, outperforming all the other self-supervised learning methods. We show that, with our scheme, a simple architecture entirely based on standard Transformers can surpass dedicated Transformer models from supervised learning. Our approach also advances state-of-the-art accuracies by 1.5%-2.3% in few-shot object classification. Furthermore, our work suggests that it is feasible to apply unified architectures from languages and images to point clouds.
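
The patch-and-mask step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that patches are formed by farthest point sampling followed by nearest-neighbour grouping, that each patch is stored relative to its center, and that the mask ratio is 0.6; the abstract states only that the patches are irregular and masked "at a high ratio". All function names and parameter values here are hypothetical, and the Transformer autoencoder itself is omitted.

```python
import math
import random


def farthest_point_sample(points, num_centers, rng):
    """Greedily pick indices of points that are spread out over the cloud."""
    first = rng.randrange(len(points))
    chosen = [first]
    # Distance from every point to its nearest chosen center so far.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(chosen) < num_centers:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def group_patches(points, center_ids, patch_size):
    """Build one patch per center from its patch_size nearest points."""
    patches = []
    for c in center_ids:
        cx, cy, cz = points[c]
        nearest = sorted(points, key=lambda p: math.dist(p, points[c]))[:patch_size]
        # Assumption: coordinates are stored relative to the patch center.
        patches.append([(x - cx, y - cy, z - cz) for x, y, z in nearest])
    return patches


def random_mask(num_patches, mask_ratio, rng):
    """Return one boolean per patch; True means the patch is masked."""
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


rng = random.Random(0)
# Toy cloud: 256 random points in a cube (stand-in for a real scan).
cloud = [tuple(rng.uniform(-1, 1) for _ in range(3)) for _ in range(256)]

centers = farthest_point_sample(cloud, num_centers=16, rng=rng)
patches = group_patches(cloud, centers, patch_size=8)
mask = random_mask(len(patches), mask_ratio=0.6, rng=rng)  # assumed ratio

# Only the visible patches would be fed to the encoder; the masked
# patches serve as reconstruction targets.
visible = [p for p, m in zip(patches, mask) if not m]
targets = [p for p, m in zip(patches, mask) if m]
```

In this sketch the encoder would receive only `visible`, which is one reading of the asymmetric design mentioned in the abstract, while `targets` holds the point patches to be reconstructed.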

Results

Task	Dataset	Metric	Value	Model
Shape Representation Of 3D Point Clouds	ScanObjectNN	OBJ-BG (OA)	90.02	Point-MAE
Shape Representation Of 3D Point Clouds	ScanObjectNN	OBJ-ONLY (OA)	88.29	Point-MAE
Shape Representation Of 3D Point Clouds	ScanObjectNN	Overall Accuracy	85.2	Point-MAE
Shape Representation Of 3D Point Clouds	ModelNet40	Overall Accuracy	94	Point-MAE
Shape Representation Of 3D Point Clouds	ModelNet40 10-way (20-shot)	Overall Accuracy	95	Point-MAE
Shape Representation Of 3D Point Clouds	ModelNet40 10-way (20-shot)	Standard Deviation	3	Point-MAE
Shape Representation Of 3D Point Clouds	ModelNet40 5-way (10-shot)	Overall Accuracy	96.3	Point-MAE
Shape Representation Of 3D Point Clouds	ModelNet40 5-way (10-shot)	Standard Deviation	2.5	Point-MAE
Shape Representation Of 3D Point Clouds	ModelNet40 10-way (10-shot)	Overall Accuracy	92.6	Point-MAE
Shape Representation Of 3D Point Clouds	ModelNet40 10-way (10-shot)	Standard Deviation	4.1	Point-MAE
Shape Representation Of 3D Point Clouds	ModelNet40 5-way (20-shot)	Overall Accuracy	97.8	Point-MAE
Shape Representation Of 3D Point Clouds	ModelNet40 5-way (20-shot)	Standard Deviation	1.8	Point-MAE
3D Point Cloud Classification	ScanObjectNN	OBJ-BG (OA)	90.02	Point-MAE
3D Point Cloud Classification	ScanObjectNN	OBJ-ONLY (OA)	88.29	Point-MAE
3D Point Cloud Classification	ScanObjectNN	Overall Accuracy	85.2	Point-MAE
3D Point Cloud Classification	ModelNet40	Overall Accuracy	94	Point-MAE
3D Point Cloud Classification	ModelNet40 10-way (20-shot)	Overall Accuracy	95	Point-MAE
3D Point Cloud Classification	ModelNet40 10-way (20-shot)	Standard Deviation	3	Point-MAE
3D Point Cloud Classification	ModelNet40 5-way (10-shot)	Overall Accuracy	96.3	Point-MAE
3D Point Cloud Classification	ModelNet40 5-way (10-shot)	Standard Deviation	2.5	Point-MAE
3D Point Cloud Classification	ModelNet40 10-way (10-shot)	Overall Accuracy	92.6	Point-MAE
3D Point Cloud Classification	ModelNet40 10-way (10-shot)	Standard Deviation	4.1	Point-MAE
3D Point Cloud Classification	ModelNet40 5-way (20-shot)	Overall Accuracy	97.8	Point-MAE
3D Point Cloud Classification	ModelNet40 5-way (20-shot)	Standard Deviation	1.8	Point-MAE
Point Cloud Segmentation	PointCloud-C	mean Corruption Error (mCE)	0.927	Point-MAE
3D Point Cloud Reconstruction	ScanObjectNN	OBJ-BG (OA)	90.02	Point-MAE
3D Point Cloud Reconstruction	ScanObjectNN	OBJ-ONLY (OA)	88.29	Point-MAE
3D Point Cloud Reconstruction	ScanObjectNN	Overall Accuracy	85.2	Point-MAE
3D Point Cloud Reconstruction	ModelNet40	Overall Accuracy	94	Point-MAE
3D Point Cloud Reconstruction	ModelNet40 10-way (20-shot)	Overall Accuracy	95	Point-MAE
3D Point Cloud Reconstruction	ModelNet40 10-way (20-shot)	Standard Deviation	3	Point-MAE
3D Point Cloud Reconstruction	ModelNet40 5-way (10-shot)	Overall Accuracy	96.3	Point-MAE
3D Point Cloud Reconstruction	ModelNet40 5-way (10-shot)	Standard Deviation	2.5	Point-MAE
3D Point Cloud Reconstruction	ModelNet40 10-way (10-shot)	Overall Accuracy	92.6	Point-MAE
3D Point Cloud Reconstruction	ModelNet40 10-way (10-shot)	Standard Deviation	4.1	Point-MAE
3D Point Cloud Reconstruction	ModelNet40 5-way (20-shot)	Overall Accuracy	97.8	Point-MAE
3D Point Cloud Reconstruction	ModelNet40 5-way (20-shot)	Standard Deviation	1.8	Point-MAE
