Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting

Ziyi Wang, Xumin Yu, Yongming Rao, Jie Zhou, Jiwen Lu

2022-08-04 · 3D Part Segmentation · 3D Point Cloud Classification
Paper · PDF · Code (official)

Abstract

Nowadays, pre-training big models on large-scale datasets has become a crucial topic in deep learning. Pre-trained models with high representation ability and transferability achieve great success and dominate many downstream tasks in natural language processing and 2D vision. However, it is non-trivial to extend such a pretraining-tuning paradigm to 3D vision, given the limited training data, which are relatively inconvenient to collect. In this paper, we provide a new perspective on leveraging pre-trained 2D knowledge in the 3D domain to tackle this problem, tuning pre-trained image models with the novel Point-to-Pixel prompting for point cloud analysis at a minor parameter cost. Following the principle of prompt engineering, we transform point clouds into colorful images with geometry-preserved projection and geometry-aware coloring to adapt to pre-trained image models, whose weights are kept frozen during the end-to-end optimization of point cloud analysis tasks. We conduct extensive experiments to demonstrate that, in cooperation with our proposed Point-to-Pixel Prompting, a better pre-trained image model leads to consistently better performance in 3D vision. Benefiting from the prosperous development of the image pre-training field, our method attains 89.3% accuracy on the hardest setting of ScanObjectNN, surpassing conventional point cloud models with far fewer trainable parameters. Our framework also exhibits very competitive performance on ModelNet classification and ShapeNet Part Segmentation. Code is available at https://github.com/wangzy22/P2P.
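The project-then-color pipeline the abstract describes can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function name, the orthographic XY projection, the z-buffer, and the fixed depth-to-RGB coloring are all simplifications, whereas the paper's geometry-aware coloring is learned end-to-end alongside the frozen 2D backbone.

```python
import numpy as np

def point_to_pixel(points, img_size=64):
    """Hypothetical sketch of a Point-to-Pixel front end: orthographically
    project a point cloud onto an image plane and color each pixel from
    point depth (a stand-in for the paper's learnable coloring)."""
    # Normalize the cloud into the unit cube.
    pts = points - points.min(axis=0)
    pts = pts / (pts.max() + 1e-8)

    # Project onto the XY plane and quantize to pixel coordinates.
    xy = np.clip((pts[:, :2] * (img_size - 1)).astype(int), 0, img_size - 1)
    depth = pts[:, 2]

    # Keep the nearest point per pixel (z-buffer), then color by depth.
    img = np.zeros((img_size, img_size, 3), dtype=np.float32)
    zbuf = np.full((img_size, img_size), np.inf)
    for (x, y), z in zip(xy, depth):
        if z < zbuf[y, x]:
            zbuf[y, x] = z
            img[y, x] = (1.0 - z, 0.5, z)  # simple depth-based RGB

    # The resulting image would be fed to a frozen pre-trained 2D model.
    return img
```

In the actual method the coloring module is differentiable, so gradients from the downstream 3D task flow through the rendered image back into the prompting parameters while the image backbone's weights stay frozen.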

Results

Task                          | Dataset       | Metric               | Value | Model
3D Part Segmentation          | ShapeNet-Part | Instance Average IoU | 86.5  | P2P
3D Point Cloud Classification | ScanObjectNN  | Overall Accuracy     | 89.3  | P2P
3D Point Cloud Classification | ModelNet40    | Mean Accuracy        | 91.6  | P2P
3D Point Cloud Classification | ModelNet40    | Overall Accuracy     | 94.0  | P2P

Related Papers

- Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning (2025-06-26)
- Rethinking Gradient-based Adversarial Attacks on Point Cloud Classification (2025-05-28)
- SMART-PC: Skeletal Model Adaptation for Robust Test-Time Training in Point Clouds (2025-05-26)
- DG-MVP: 3D Domain Generalization via Multiple Views of Point Clouds for Classification (2025-04-16)
- HoloPart: Generative 3D Part Amodal Segmentation (2025-04-10)
- Introducing the Short-Time Fourier Kolmogorov Arnold Network: A Dynamic Graph CNN Approach for Tree Species Classification in 3D Point Clouds (2025-03-31)
- Open-Vocabulary Semantic Part Segmentation of 3D Human (2025-02-27)
- Point-LN: A Lightweight Framework for Efficient Point Cloud Classification Using Non-Parametric Positional Encoding (2025-01-24)