Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

GPSFormer: A Global Perception and Local Structure Fitting-based Transformer for Point Cloud Understanding

Changshuo Wang, Meiqing Wu, Siew-Kei Lam, Xin Ning, Shangshu Yu, Ruiping Wang, Weijun Li, Thambipillai Srikanthan

2024-07-18 · Few-Shot Learning · 3D Point Cloud Classification

Abstract

Despite the significant advancements in pre-training methods for point cloud understanding, directly capturing intricate shape information from irregular point clouds without reliance on external data remains a formidable challenge. To address this problem, we propose GPSFormer, an innovative Global Perception and Local Structure Fitting-based Transformer, which learns detailed shape information from point clouds with remarkable precision. The core of GPSFormer is the Global Perception Module (GPM) and the Local Structure Fitting Convolution (LSFConv). Specifically, GPM utilizes Adaptive Deformable Graph Convolution (ADGConv) to identify short-range dependencies among similar features in the feature space and employs Multi-Head Attention (MHA) to learn long-range dependencies across all positions within the feature space, ultimately enabling flexible learning of contextual representations. Inspired by the Taylor series, we design LSFConv, which learns both low-order fundamental and high-order refinement information from explicitly encoded local geometric structures. Integrating the GPM and LSFConv as fundamental components, we construct GPSFormer, a cutting-edge Transformer that effectively captures global and local structures of point clouds. Extensive experiments validate GPSFormer's effectiveness in three point cloud tasks: shape classification, part segmentation, and few-shot learning. The code of GPSFormer is available at https://github.com/changshuowang/GPSFormer.
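The abstract says that GPM looks for short-range dependencies among similar features in the feature space. One common way to find such neighbours is a k-nearest-neighbour lookup on feature vectors rather than on 3D coordinates. The sketch below shows only that lookup, under the assumption that "similar" means small Euclidean distance between features. It is not ADGConv and is not taken from the GPSFormer code; the name `feature_knn` is hypothetical.

```python
import math


def feature_knn(features, k):
    """Return, for each point, the indices of its k nearest points in feature space.

    `features` is a list of equal-length float vectors, one per point.
    A point is never listed as its own neighbour.
    """
    n = len(features)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")
    neighbours = []
    for i in range(n):
        # Distance from point i to every other point, measured on features
        dists = [(math.dist(features[i], features[j]), j) for j in range(n) if j != i]
        dists.sort()
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


# Four points with 2-D features forming two tight clusters (illustrative data)
feats = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.2]]
print(feature_knn(feats, k=1))  # [[1], [0], [3], [2]]
```

Each point is paired with the other member of its cluster, regardless of where the points sit in 3D space. A graph convolution would then aggregate information over these neighbour lists.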

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Shape Representation Of 3D Point Clouds | ScanObjectNN | Mean Accuracy | 93.8 | GPSFormer |
| Shape Representation Of 3D Point Clouds | ScanObjectNN | Overall Accuracy | 95.4 | GPSFormer |
| Shape Representation Of 3D Point Clouds | ScanObjectNN | Mean Accuracy | 92.51 | GPSFormer-elite |
| Shape Representation Of 3D Point Clouds | ScanObjectNN | Overall Accuracy | 93.3 | GPSFormer-elite |
| 3D Point Cloud Classification | ScanObjectNN | Mean Accuracy | 93.8 | GPSFormer |
| 3D Point Cloud Classification | ScanObjectNN | Overall Accuracy | 95.4 | GPSFormer |
| 3D Point Cloud Classification | ScanObjectNN | Mean Accuracy | 92.51 | GPSFormer-elite |
| 3D Point Cloud Classification | ScanObjectNN | Overall Accuracy | 93.3 | GPSFormer-elite |
| 3D Point Cloud Reconstruction | ScanObjectNN | Mean Accuracy | 93.8 | GPSFormer |
| 3D Point Cloud Reconstruction | ScanObjectNN | Overall Accuracy | 95.4 | GPSFormer |
| 3D Point Cloud Reconstruction | ScanObjectNN | Mean Accuracy | 92.51 | GPSFormer-elite |
| 3D Point Cloud Reconstruction | ScanObjectNN | Overall Accuracy | 93.3 | GPSFormer-elite |
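
The results report both Mean Accuracy and Overall Accuracy, which the page does not define. Assuming the usual classification definitions (overall accuracy is the fraction of all samples predicted correctly, and mean accuracy is the average of the per-class accuracies), the difference can be sketched as follows. The function names and the toy labels are illustrative and unrelated to ScanObjectNN.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_class_accuracy(y_true, y_pred):
    """Unweighted average of per-class accuracies (each class counts equally)."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Imbalanced toy example: four chairs, one table, and a model that always says "chair"
y_true = ["chair", "chair", "chair", "chair", "table"]
y_pred = ["chair", "chair", "chair", "chair", "chair"]

print(overall_accuracy(y_true, y_pred))     # 0.8
print(mean_class_accuracy(y_true, y_pred))  # 0.5
```

The two metrics diverge when classes are imbalanced, so a model can score well on overall accuracy while doing poorly on rare classes.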

Related Papers

GLAD: Generalizable Tuning for Vision-Language Models (2025-07-17)
Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection (2025-07-10)
An Enhanced Privacy-preserving Federated Few-shot Learning Framework for Respiratory Disease Diagnosis (2025-07-10)
Few-Shot Learning by Explicit Physics Integration: An Application to Groundwater Heat Transport (2025-07-08)
ViRefSAM: Visual Reference-Guided Segment Anything Model for Remote Sensing Segmentation (2025-07-03)
Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning (2025-06-26)
Dynamic Context-Aware Prompt Recommendation for Domain-Specific AI Applications (2025-06-25)
Ancient Script Image Recognition and Processing: A Review (2025-06-24)