Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance

Phuc D. A. Nguyen, Tuan Duc Ngo, Evangelos Kalogerakis, Chuang Gan, Anh Tran, Cuong Pham, Khoi Nguyen

Published: 2023-12-17
Venue: CVPR 2024
Tasks: 3D Instance Segmentation, Scene Understanding, Semantic Segmentation, Object Localization, 3D Open-Vocabulary Instance Segmentation, Instance Segmentation

Abstract

We introduce Open3DIS, a novel solution designed to tackle the problem of Open-Vocabulary Instance Segmentation within 3D scenes. Objects within 3D environments exhibit diverse shapes, scales, and colors, making precise instance-level identification a challenging task. Recent advancements in Open-Vocabulary scene understanding have made significant strides in this area by employing class-agnostic 3D instance proposal networks for object localization and learning queryable features for each 3D mask. While these methods produce high-quality instance proposals, they struggle with identifying small-scale and geometrically ambiguous objects. The key idea of our method is a new module that aggregates 2D instance masks across frames and maps them to geometrically coherent point cloud regions as high-quality object proposals addressing the above limitations. These are then combined with 3D class-agnostic instance proposals to include a wide range of objects in the real world. To validate our approach, we conducted experiments on three prominent datasets, including ScanNet200, S3DIS, and Replica, demonstrating significant performance gains in segmenting objects with diverse categories over the state-of-the-art approaches.
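The core idea in the abstract, lifting 2D instance masks from several frames onto point cloud regions, can be illustrated with a minimal sketch. This is not the authors' module: it assumes a pinhole camera with a translation-only pose (rotation omitted), performs no occlusion or depth check, and uses a simple per-point vote in place of the paper's geometrically coherent grouping. The names `project_point`, `lift_mask`, `aggregate_masks`, and `min_votes` are hypothetical.

```python
def project_point(point, intrinsics):
    """Project a camera-frame 3D point to a pixel (u, v) with a pinhole model.

    Returns None when the point lies on or behind the image plane.
    """
    x, y, z = point
    fx, fy, cx, cy = intrinsics
    if z <= 0:
        return None
    return int(round(fx * x / z + cx)), int(round(fy * y / z + cy))


def lift_mask(points, mask, intrinsics, cam_position):
    """Return indices of world points that project onto a True pixel of `mask`.

    Assumption: the camera pose is a pure translation, so a world point is
    moved into the camera frame by subtracting `cam_position`.
    """
    height, width = len(mask), len(mask[0])
    hits = set()
    for idx, (px, py, pz) in enumerate(points):
        cam_point = (px - cam_position[0], py - cam_position[1], pz - cam_position[2])
        pixel = project_point(cam_point, intrinsics)
        if pixel is None:
            continue
        u, v = pixel
        if 0 <= u < width and 0 <= v < height and mask[v][u]:
            hits.add(idx)
    return hits


def aggregate_masks(points, frames, min_votes=2):
    """Keep points covered by the 2D mask in at least `min_votes` frames.

    `frames` is a list of (mask, intrinsics, cam_position) tuples that are
    assumed to show the same object instance.
    """
    votes = [0] * len(points)
    for mask, intrinsics, cam_position in frames:
        for idx in lift_mask(points, mask, intrinsics, cam_position):
            votes[idx] += 1
    return [idx for idx, count in enumerate(votes) if count >= min_votes]


# Toy scene: three points, two frames of a 3x3 image whose centre pixel is masked.
points = [(0.0, 0.0, 1.0), (2.0, 2.0, 1.0), (0.0, 0.0, -1.0)]
mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]
intrinsics = (1.0, 1.0, 1.0, 1.0)  # fx, fy, cx, cy
frames = [(mask, intrinsics, (0.0, 0.0, 0.0)),
          (mask, intrinsics, (0.0, 0.0, -1.0))]
proposal = aggregate_masks(points, frames)
```

In this toy scene only the first point falls on the masked pixel in both frames, so it alone forms the 3D proposal. A real pipeline would also need full camera extrinsics and a visibility test against the depth map.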

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Instance Segmentation | ScanNet++ | mAP | 20.7 | Open3DIS |
| Instance Segmentation | ScanNet200 | mAP | 23.7 | Open3DIS (Open-Vocabulary) |
| 3D Open-Vocabulary Instance Segmentation | ScanNet200 | AP Common | 21.2 | Open3DIS |
| 3D Open-Vocabulary Instance Segmentation | ScanNet200 | AP Head | 27.8 | Open3DIS |
| 3D Open-Vocabulary Instance Segmentation | ScanNet200 | AP Tail | 21.8 | Open3DIS |
| 3D Open-Vocabulary Instance Segmentation | ScanNet200 | AP25 | 32.8 | Open3DIS |
| 3D Open-Vocabulary Instance Segmentation | ScanNet200 | AP50 | 29.4 | Open3DIS |
| 3D Open-Vocabulary Instance Segmentation | ScanNet200 | mAP | 23.7 | Open3DIS |
| 3D Open-Vocabulary Instance Segmentation | Replica | mAP | 18.1 | Open3DIS |
| 3D Open-Vocabulary Instance Segmentation | S3DIS | AP50 Base B6/N6 | 50 | Open3DIS |
| 3D Open-Vocabulary Instance Segmentation | S3DIS | AP50 Base B8/N4 | 60.8 | Open3DIS |
| 3D Open-Vocabulary Instance Segmentation | S3DIS | AP50 Novel B6/N6 | 29 | Open3DIS |
| 3D Open-Vocabulary Instance Segmentation | S3DIS | AP50 Novel B8/N4 | 26.3 | Open3DIS |
| 3D Instance Segmentation | ScanNet++ | mAP | 20.7 | Open3DIS |
| 3D Instance Segmentation | ScanNet200 | mAP | 23.7 | Open3DIS (Open-Vocabulary) |
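For readers unfamiliar with the AP25 and AP50 rows, the sketch below shows the matching criterion they are conventionally built on, assuming the usual ScanNet-style protocol in which a predicted instance counts as correct when its intersection-over-union (IoU) with a ground-truth instance reaches 0.25 or 0.5. It covers only that matching step, not confidence ranking or the averaging that yields the final AP. Masks are represented as sets of point indices, and `mask_iou` and `is_match` are hypothetical helper names.

```python
def mask_iou(pred, gt):
    """IoU between two 3D instance masks given as collections of point indices."""
    pred, gt = set(pred), set(gt)
    union = pred | gt
    if not union:
        return 0.0
    return len(pred & gt) / len(union)


def is_match(pred, gt, threshold):
    """True when the predicted mask overlaps the ground truth at or above `threshold`."""
    return mask_iou(pred, gt) >= threshold


# Toy example: 2 shared points out of 8 distinct points in total.
pred_mask = {0, 1, 2, 3}
gt_mask = {2, 3, 4, 5, 6, 7}
iou = mask_iou(pred_mask, gt_mask)
```

Here the IoU is 2/8 = 0.25, so the prediction would count as a match at the 0.25 threshold but not at 0.5.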

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Advancing Complex Wide-Area Scene Understanding with Hierarchical Coresets Selection (2025-07-17)
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)