Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Mask3D: Mask Transformer for 3D Semantic Instance Segmentation

Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, Bastian Leibe

2022-10-06
Tasks: 3D Instance Segmentation, Segmentation, Semantic Segmentation, Instance Segmentation, 3D Semantic Instance Segmentation

Abstract

Modern 3D semantic instance segmentation approaches predominantly rely on specialized voting mechanisms followed by carefully designed geometric clustering techniques. Building on the successes of recent Transformer-based methods for object detection and image segmentation, we propose the first Transformer-based approach for 3D semantic instance segmentation. We show that we can leverage generic Transformer building blocks to directly predict instance masks from 3D point clouds. In our model, called Mask3D, each object instance is represented as an instance query. Using Transformer decoders, the instance queries are learned by iteratively attending to point cloud features at multiple scales. Combined with point features, the instance queries directly yield all instance masks in parallel. Mask3D has several advantages over current state-of-the-art approaches: (1) it does not rely on voting schemes, which require hand-selected geometric properties (such as centers); (2) it does not rely on geometric grouping mechanisms, which require manually tuned hyper-parameters (e.g. radii); and (3) it enables a loss that directly optimizes instance masks. Mask3D sets a new state-of-the-art on ScanNet test (+6.2 mAP), S3DIS 6-fold (+10.1 mAP), STPLS3D (+11.2 mAP) and ScanNet200 test (+12.4 mAP).
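The step in which instance queries and point features are combined into masks can be illustrated with a toy example. This is only a minimal sketch, not the authors' implementation: the abstract does not say how the two are combined, so the dot product followed by a sigmoid and a 0.5 threshold is an assumption, and the function name `predict_instance_masks` and the toy features are hypothetical.

```python
import math


def predict_instance_masks(point_features, instance_queries, threshold=0.5):
    """Return one binary mask per query, with one 0/1 entry per point.

    Assumption (not stated in the abstract): a point belongs to an instance
    when sigmoid(query . point_feature) exceeds the threshold.
    """
    masks = []
    for query in instance_queries:
        mask = []
        for feature in point_features:
            # Similarity between this query and this point (assumed: dot product).
            logit = sum(q * f for q, f in zip(query, feature))
            probability = 1.0 / (1.0 + math.exp(-logit))
            mask.append(1 if probability > threshold else 0)
        masks.append(mask)
    return masks


# Toy data: four points with 2-D features and two instance queries.
point_features = [[4.0, -4.0], [3.0, -3.0], [-4.0, 4.0], [-3.0, 3.0]]
instance_queries = [[1.0, -1.0], [-1.0, 1.0]]

masks = predict_instance_masks(point_features, instance_queries)
print(masks)  # [[1, 1, 0, 0], [0, 0, 1, 1]]
```

Each query produces its mask independently of the others, which is why all masks can be computed in parallel; a real model would express the same step as a single matrix product over learned features.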

Results

Task                      Dataset      Metric  Value  Model
Semantic Segmentation     Replica      mIoU    22.6   Mask3D
Instance Segmentation     S3DIS        AP@50   75.5   Mask3D
Instance Segmentation     S3DIS        mAP     64.5   Mask3D
Instance Segmentation     ScanNet(v2)  mAP     55.2   Mask3D
Instance Segmentation     ScanNet(v2)  mAP@50  78     Mask3D
Instance Segmentation     ScanNet(v2)  mAP@25  87     Mask3D
Instance Segmentation     ScanNet200   mAP     27.8   Mask3D
Instance Segmentation     STPLS3D      AP      57.3   Mask3D
Instance Segmentation     STPLS3D      AP25    81.6   Mask3D
Instance Segmentation     STPLS3D      AP50    74.3   Mask3D
10-shot image generation  Replica      mIoU    22.6   Mask3D
3D Instance Segmentation  S3DIS        AP@50   75.5   Mask3D
3D Instance Segmentation  S3DIS        mAP     64.5   Mask3D
3D Instance Segmentation  ScanNet(v2)  mAP     55.2   Mask3D
3D Instance Segmentation  ScanNet(v2)  mAP@50  78     Mask3D
3D Instance Segmentation  ScanNet(v2)  mAP@25  87     Mask3D
3D Instance Segmentation  ScanNet200   mAP     27.8   Mask3D
3D Instance Segmentation  STPLS3D      AP      57.3   Mask3D
3D Instance Segmentation  STPLS3D      AP25    81.6   Mask3D
3D Instance Segmentation  STPLS3D      AP50    74.3   Mask3D

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)