Point Transformer V3: Simpler, Faster, Stronger

Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao

2023-12-15 | Tasks: Representation Learning, Semantic Segmentation, 3D Semantic Segmentation, LIDAR Semantic Segmentation

Abstract

This paper is not motivated to seek innovation within the attention mechanism. Instead, it focuses on overcoming the existing trade-offs between accuracy and efficiency within the context of point cloud processing, leveraging the power of scale. Drawing inspiration from recent advances in 3D large-scale representation learning, we recognize that model performance is more influenced by scale than by intricate design. Therefore, we present Point Transformer V3 (PTv3), which prioritizes simplicity and efficiency over the accuracy of certain mechanisms that are minor to the overall performance after scaling, such as replacing the precise neighbor search by KNN with an efficient serialized neighbor mapping of point clouds organized with specific patterns. This principle enables significant scaling, expanding the receptive field from 16 to 1024 points while remaining efficient (a 3x increase in processing speed and a 10x improvement in memory efficiency compared with its predecessor, PTv2). PTv3 attains state-of-the-art results on over 20 downstream tasks that span both indoor and outdoor scenarios. Further enhanced with multi-dataset joint training, PTv3 pushes these results to a higher level.
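
To make the idea of a serialized neighbor mapping concrete, here is a minimal sketch of how points can be ordered along a space-filling curve and then grouped into fixed-size patches, so that "neighbors" are simply adjacent entries in the sorted order rather than the result of a KNN query. The abstract does not say which patterns PTv3 uses; the Z-order (Morton) curve below is an assumption chosen for illustration, and the names `morton_code`, `serialize`, `patches`, `grid_size`, and `patch_size` are hypothetical, not taken from the paper or its code.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def _spread_bits(v: int, bits: int) -> int:
    """Move bit i of v to bit 3*i, leaving two zero bits between them."""
    out = 0
    for i in range(bits):
        out |= ((v >> i) & 1) << (3 * i)
    return out


def morton_code(ix: int, iy: int, iz: int, bits: int = 10) -> int:
    """Interleave the bits of three integer grid coordinates (Z-order)."""
    return (_spread_bits(ix, bits)
            | (_spread_bits(iy, bits) << 1)
            | (_spread_bits(iz, bits) << 2))


def serialize(points: Sequence[Point], grid_size: float, bits: int = 10) -> List[int]:
    """Return point indices sorted by the Morton code of their grid cell."""
    mins = [min(p[d] for p in points) for d in range(3)]
    limit = (1 << bits) - 1  # clamp cells so they fit in `bits` bits per axis

    def code(p: Point) -> int:
        cells = [min(int((p[d] - mins[d]) / grid_size), limit) for d in range(3)]
        return morton_code(cells[0], cells[1], cells[2], bits=bits)

    # sorted() is stable, so points in the same cell keep their input order.
    return sorted(range(len(points)), key=lambda i: code(points[i]))


def patches(order: Sequence[int], patch_size: int) -> List[List[int]]:
    """Split the serialized order into consecutive groups of patch_size."""
    return [list(order[i:i + patch_size]) for i in range(0, len(order), patch_size)]


if __name__ == "__main__":
    pts = [(0.9, 0.9, 0.9), (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (1.0, 1.0, 1.0)]
    order = serialize(pts, grid_size=0.5)
    print(order)
    print(patches(order, patch_size=2))
```

Sorting once costs O(N log N) and every later grouping is a plain slice, which is the kind of trade the abstract describes: locality becomes approximate (two points adjacent on the curve are usually, but not always, close in space) in exchange for a much cheaper and more regular neighborhood structure.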

Results

Task                         Dataset      Metric     Value  Model
Semantic Segmentation        ScanNet      test mIoU  79.4   PTv3 + PPT
Semantic Segmentation        ScanNet      val mIoU   78.6   PTv3 + PPT
Semantic Segmentation        S3DIS Area5  mAcc       80.1   PTv3 + PPT
Semantic Segmentation        S3DIS Area5  mIoU       74.7   PTv3 + PPT
Semantic Segmentation        S3DIS Area5  oAcc       92     PTv3 + PPT
Semantic Segmentation        S3DIS        Mean IoU   80.8   PTv3 + PPT
Semantic Segmentation        S3DIS        mAcc       87.7   PTv3 + PPT
Semantic Segmentation        S3DIS        oAcc       92.6   PTv3 + PPT
Semantic Segmentation        ScanNet200   test mIoU  39.3   PTv3 + PPT
Semantic Segmentation        ScanNet200   val mIoU   36     PTv3 + PPT
Semantic Segmentation        ScanNet++    Top-1 IoU  0.488  PTv3
Semantic Segmentation        ScanNet++    Top-3 IoU  0.725  PTv3
3D Semantic Segmentation     ScanNet200   test mIoU  39.3   PTv3 + PPT
3D Semantic Segmentation     ScanNet200   val mIoU   36     PTv3 + PPT
3D Semantic Segmentation     ScanNet++    Top-1 IoU  0.488  PTv3
3D Semantic Segmentation     ScanNet++    Top-3 IoU  0.725  PTv3
LIDAR Semantic Segmentation  nuScenes     test mIoU  0.83   PTv3 + PPT
LIDAR Semantic Segmentation  nuScenes     val mIoU   0.812  PTv3 + PPT
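
The results above are reported as mIoU, mAcc, and oAcc. As a rough guide to what those numbers measure, the sketch below computes all three from a confusion matrix using the standard textbook definitions. It assumes integer class labels and averages only over classes that actually occur; individual benchmarks may apply their own rules (ignored labels, fixed class lists, percentage versus fraction scaling), so this is not the evaluation code of any listed dataset. The function names `confusion_matrix` and `segmentation_metrics` are hypothetical.

```python
from typing import Dict, List, Sequence


def confusion_matrix(gt: Sequence[int], pred: Sequence[int], num_classes: int) -> List[List[int]]:
    """cm[g][p] counts points with ground-truth class g predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        cm[g][p] += 1
    return cm


def segmentation_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Return mIoU, mAcc and oAcc as fractions in [0, 1]."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(n))
    ious: List[float] = []
    accs: List[float] = []
    for c in range(n):
        tp = cm[c][c]
        gt_c = sum(cm[c])                          # points whose true class is c
        pred_c = sum(cm[r][c] for r in range(n))   # points predicted as c
        union = gt_c + pred_c - tp
        if union > 0:    # skip classes absent from both gt and prediction
            ious.append(tp / union)
        if gt_c > 0:     # per-class accuracy needs at least one true point
            accs.append(tp / gt_c)
    return {
        "mIoU": sum(ious) / len(ious) if ious else 0.0,
        "mAcc": sum(accs) / len(accs) if accs else 0.0,
        "oAcc": correct / total if total else 0.0,
    }


if __name__ == "__main__":
    gt = [0, 0, 1, 1]
    pred = [0, 1, 1, 1]
    print(segmentation_metrics(confusion_matrix(gt, pred, num_classes=2)))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so mIoU sits below oAcc (3/4); the gap grows when rare classes are segmented poorly, which is why mIoU is usually the headline metric.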
