Any3DIS: Class-Agnostic 3D Instance Segmentation by 2D Mask Tracking

Phuc Nguyen, Minh Luu, Anh Tran, Cuong Pham, Khoi Nguyen

2024-11-25CVPR 2025 13D Instance Segmentation Segmentation Semantic Segmentation 3D Open-Vocabulary Instance Segmentation Instance Segmentation

Paper PDF

Abstract

Existing 3D instance segmentation methods frequently encounter issues with over-segmentation, leading to redundant and inaccurate 3D proposals that complicate downstream tasks. This challenge arises from their unsupervised merging approach, where dense 2D instance masks are lifted across frames into point clouds to form 3D candidate proposals without direct supervision. These candidates are then hierarchically merged based on heuristic criteria, often resulting in numerous redundant segments that fail to combine into precise 3D proposals. To overcome these limitations, we propose a 3D-Aware 2D Mask Tracking module that uses robust 3D priors from a 2D mask segmentation and tracking foundation model (SAM-2) to ensure consistent object masks across video frames. Rather than merging all visible superpoints across views to create a 3D mask, our 3D Mask Optimization module leverages a dynamic programming algorithm to select an optimal set of views, refining the superpoints to produce a final 3D proposal for each object. Our approach achieves comprehensive object coverage within the scene while reducing unnecessary proposals, which could otherwise impair downstream applications. Evaluations on ScanNet200 and ScanNet++ confirm the effectiveness of our method, with improvements across Class-Agnostic, Open-Vocabulary, and Open-Ended 3D Instance Segmentation tasks.

Results

Task	Dataset	Metric	Value	Model
Instance Segmentation	ScanNet++	mAP	22.2	Any3DIS
3D Open-Vocabulary Instance Segmentation	ScanNet200	AP Common	23.8	Any3DIS
3D Open-Vocabulary Instance Segmentation	ScanNet200	AP Head	27.4	Any3DIS
3D Open-Vocabulary Instance Segmentation	ScanNet200	AP Tail	26.4	Any3DIS
3D Open-Vocabulary Instance Segmentation	ScanNet200	mAP	25.8	Any3DIS
3D Instance Segmentation	ScanNet++	mAP	22.2	Any3DIS

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction2025-07-21 Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction2025-07-17 DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model2025-07-17 From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation2025-07-17 Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion2025-07-17 SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation2025-07-17 Unified Medical Image Segmentation with State Space Modeling Snake2025-07-17 A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique2025-07-17