Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders

Xiangdong Zhang, Shaofeng Zhang, Junchi Yan

Published: 2024-08-16
Tasks: Learning Semantic Representations · Few-Shot Learning · Self-Supervised Learning · Few-Shot 3D Point Cloud Classification · 3D Object Classification · 3D Point Cloud Classification

Paper · PDF · Code (official)

Abstract

Masked autoencoders have been widely explored in point cloud self-supervised learning, where the point cloud is generally divided into visible and masked parts. These methods typically include an encoder that accepts visible patches (normalized) and the corresponding patch centers (positions) as input, and a decoder that accepts the encoder's output together with the centers (positions) of the masked parts to reconstruct each point in the masked patches. The pre-trained encoder is then used for downstream tasks. In this paper, we show a motivating empirical result: when the centers of masked patches are fed directly to the decoder without any information from the encoder, the decoder still reconstructs well. In other words, the patch centers are important, and the reconstruction objective does not necessarily rely on the encoder's representations, which prevents the encoder from learning semantic representations. Based on this key observation, we propose a simple yet effective method, learning to Predict Centers for Point Masked AutoEncoders (PCP-MAE), which guides the model to predict these significant centers and uses the predicted centers in place of the directly provided ones. Specifically, we propose a Predicting Center Module (PCM) that shares parameters with the original encoder and adds extra cross-attention to predict centers. Our method has high pre-training efficiency compared to the alternatives and achieves substantial improvement over Point-MAE, surpassing it by 5.50% on OBJ-BG, 6.03% on OBJ-ONLY, and 5.17% on PB-T50-RS for 3D object classification on the ScanObjectNN dataset. The code is available at https://github.com/aHapBean/PCP-MAE.
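The core idea in the abstract can be illustrated with a minimal NumPy sketch: instead of handing the decoder the true masked-patch centers, a set of mask queries cross-attends to the encoder's visible-patch tokens and regresses the centers, which are supervised against the ground truth. This is an illustrative sketch only, not the authors' implementation; all names (`cross_attention`, `W_center`, the dimensions) are hypothetical, and the real PCM shares parameters with the encoder and uses full multi-head attention.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    # Scaled dot-product cross-attention: mask queries attend to encoder outputs.
    d = queries.shape[-1]
    attn = softmax(queries @ keys.T / np.sqrt(d))
    return attn @ values

rng = np.random.default_rng(0)
D = 32                       # token/feature dimension (hypothetical)
n_vis, n_mask = 48, 16       # visible vs. masked patch counts (hypothetical)

vis_tokens = rng.normal(size=(n_vis, D))      # encoder output for visible patches
mask_queries = rng.normal(size=(n_mask, D))   # learnable queries, one per masked patch
W_center = rng.normal(size=(D, 3)) * 0.1      # head projecting features to xyz centers

# PCM sketch: queries cross-attend to visible tokens, then predict 3D centers.
pcm_features = cross_attention(mask_queries, vis_tokens, vis_tokens)
pred_centers = pcm_features @ W_center        # shape (n_mask, 3)

# Training target: the true masked-patch centers, used only as supervision
# (never fed to the decoder directly, unlike in Point-MAE).
true_centers = rng.normal(size=(n_mask, 3))
center_loss = np.mean((pred_centers - true_centers) ** 2)

print(pred_centers.shape)
```

In this framing, the decoder would receive `pred_centers` as positional information for the masked patches, so good reconstruction requires the encoder's features to carry enough semantics to locate the masked regions.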

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Shape Representation Of 3D Point Clouds | ScanObjectNN | OBJ-BG (OA) | 95.52 | PCP-MAE |
| Shape Representation Of 3D Point Clouds | ScanObjectNN | OBJ-ONLY (OA) | 94.32 | PCP-MAE |
| Shape Representation Of 3D Point Clouds | ScanObjectNN | Overall Accuracy | 90.35 | PCP-MAE |
| Shape Representation Of 3D Point Clouds | ModelNet40 | Overall Accuracy | 94.2 | PCP-MAE |
| Shape Representation Of 3D Point Clouds | ModelNet40 10-way (20-shot) | Overall Accuracy | 95.9 | PCP-MAE |
| Shape Representation Of 3D Point Clouds | ModelNet40 10-way (20-shot) | Standard Deviation | 2.7 | PCP-MAE |
| Shape Representation Of 3D Point Clouds | ModelNet40 5-way (10-shot) | Overall Accuracy | 97.4 | PCP-MAE |
| Shape Representation Of 3D Point Clouds | ModelNet40 5-way (10-shot) | Standard Deviation | 2.3 | PCP-MAE |
| Shape Representation Of 3D Point Clouds | ModelNet40 10-way (10-shot) | Overall Accuracy | 93.5 | PCP-MAE |
| Shape Representation Of 3D Point Clouds | ModelNet40 10-way (10-shot) | Standard Deviation | 3.7 | PCP-MAE |
| Shape Representation Of 3D Point Clouds | ModelNet40 5-way (20-shot) | Overall Accuracy | 99.1 | PCP-MAE |
| Shape Representation Of 3D Point Clouds | ModelNet40 5-way (20-shot) | Standard Deviation | 0.8 | PCP-MAE |
| 3D Point Cloud Classification | ScanObjectNN | OBJ-BG (OA) | 95.52 | PCP-MAE |
| 3D Point Cloud Classification | ScanObjectNN | OBJ-ONLY (OA) | 94.32 | PCP-MAE |
| 3D Point Cloud Classification | ScanObjectNN | Overall Accuracy | 90.35 | PCP-MAE |
| 3D Point Cloud Classification | ModelNet40 | Overall Accuracy | 94.2 | PCP-MAE |
| 3D Point Cloud Classification | ModelNet40 10-way (20-shot) | Overall Accuracy | 95.9 | PCP-MAE |
| 3D Point Cloud Classification | ModelNet40 10-way (20-shot) | Standard Deviation | 2.7 | PCP-MAE |
| 3D Point Cloud Classification | ModelNet40 5-way (10-shot) | Overall Accuracy | 97.4 | PCP-MAE |
| 3D Point Cloud Classification | ModelNet40 5-way (10-shot) | Standard Deviation | 2.3 | PCP-MAE |
| 3D Point Cloud Classification | ModelNet40 10-way (10-shot) | Overall Accuracy | 93.5 | PCP-MAE |
| 3D Point Cloud Classification | ModelNet40 10-way (10-shot) | Standard Deviation | 3.7 | PCP-MAE |
| 3D Point Cloud Classification | ModelNet40 5-way (20-shot) | Overall Accuracy | 99.1 | PCP-MAE |
| 3D Point Cloud Classification | ModelNet40 5-way (20-shot) | Standard Deviation | 0.8 | PCP-MAE |
| 3D Point Cloud Reconstruction | ScanObjectNN | OBJ-BG (OA) | 95.52 | PCP-MAE |
| 3D Point Cloud Reconstruction | ScanObjectNN | OBJ-ONLY (OA) | 94.32 | PCP-MAE |
| 3D Point Cloud Reconstruction | ScanObjectNN | Overall Accuracy | 90.35 | PCP-MAE |
| 3D Point Cloud Reconstruction | ModelNet40 | Overall Accuracy | 94.2 | PCP-MAE |
| 3D Point Cloud Reconstruction | ModelNet40 10-way (20-shot) | Overall Accuracy | 95.9 | PCP-MAE |
| 3D Point Cloud Reconstruction | ModelNet40 10-way (20-shot) | Standard Deviation | 2.7 | PCP-MAE |
| 3D Point Cloud Reconstruction | ModelNet40 5-way (10-shot) | Overall Accuracy | 97.4 | PCP-MAE |
| 3D Point Cloud Reconstruction | ModelNet40 5-way (10-shot) | Standard Deviation | 2.3 | PCP-MAE |
| 3D Point Cloud Reconstruction | ModelNet40 10-way (10-shot) | Overall Accuracy | 93.5 | PCP-MAE |
| 3D Point Cloud Reconstruction | ModelNet40 10-way (10-shot) | Standard Deviation | 3.7 | PCP-MAE |
| 3D Point Cloud Reconstruction | ModelNet40 5-way (20-shot) | Overall Accuracy | 99.1 | PCP-MAE |
| 3D Point Cloud Reconstruction | ModelNet40 5-way (20-shot) | Standard Deviation | 0.8 | PCP-MAE |

Related Papers

- GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
- A Semi-Supervised Learning Method for the Identification of Bad Exposures in Large Imaging Surveys (2025-07-17)
- Self-supervised Learning on Camera Trap Footage Yields a Strong Universal Face Embedder (2025-07-14)
- Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection (2025-07-10)
- An Enhanced Privacy-preserving Federated Few-shot Learning Framework for Respiratory Disease Diagnosis (2025-07-10)
- Few-Shot Learning by Explicit Physics Integration: An Application to Groundwater Heat Transport (2025-07-08)
- Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis (2025-07-08)
- ViRefSAM: Visual Reference-Guided Segment Anything Model for Remote Sensing Segmentation (2025-07-03)