ISDA: Position-Aware Instance Segmentation with Deformable Attention

Kaining Ying, Zhenhua Wang, Cong Bai, Pengfei Zhou

2022-02-23

Segmentation, Semantic Segmentation, Instance Segmentation

Abstract

Most instance segmentation models are not end-to-end trainable because they rely on either proposal estimation (RPN) as a pre-processing step or non-maximum suppression (NMS) as a post-processing step. Here we propose a novel end-to-end instance segmentation method termed ISDA. It recasts the task as predicting a set of object masks, which are generated by a traditional convolution operation using learned position-aware kernels and object features. These kernels and features are learned with a deformable attention network that uses a multi-scale representation. Thanks to the set-prediction mechanism, the proposed method is NMS-free. Empirically, ISDA outperforms Mask R-CNN (a strong baseline) by 2.6 points on MS-COCO and achieves leading performance compared with recent models. Code will be available soon.
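
To make the mask-generation step concrete, the following is a minimal sketch of producing one mask per object by convolving a feature map with per-object kernels. It is not the authors' implementation: the 1x1 kernel size, the single shared feature map, the tensor shapes, and the function name `masks_from_kernels` are all assumptions chosen for illustration, and the random inputs stand in for what the deformable attention network would learn.

```python
import numpy as np


def masks_from_kernels(kernels, features):
    """Apply one 1x1 kernel per object to a feature map (illustrative only).

    kernels:  (N, C) array, one hypothetical 1x1 kernel per predicted object.
    features: (C, H, W) array, an assumed shared mask feature map.
    Returns an (N, H, W) array of mask probabilities in (0, 1).
    """
    # A 1x1 convolution is a per-pixel dot product over the channel axis.
    logits = np.einsum("nc,chw->nhw", kernels, features)
    # Sigmoid turns logits into per-pixel foreground probabilities.
    return 1.0 / (1.0 + np.exp(-logits))


# Random stand-ins for learned kernels and features (assumed sizes).
rng = np.random.default_rng(0)
kernels = rng.normal(size=(3, 8))      # 3 objects, 8 channels
features = rng.normal(size=(8, 4, 5))  # 8 channels, 4x5 spatial grid

masks = masks_from_kernels(kernels, features)
binary_masks = masks > 0.5             # threshold to get binary masks
```

Because each object in the predicted set has its own kernel and therefore its own mask, a set-prediction formulation of this kind needs no NMS pass to remove duplicate detections.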

Results

Task	Dataset	Metric	Value	Model
Instance Segmentation	COCO test-dev	AP50	62	ISDA (ours)
Instance Segmentation	COCO test-dev	AP75	41.1	ISDA (ours)
Instance Segmentation	COCO test-dev	APM	41.2	ISDA (ours)
Instance Segmentation	COCO test-dev	APS	17	ISDA (ours)
Instance Segmentation	COCO test-dev	mask AP	38.7	ISDA (ours)
Instance Segmentation	COCO test-dev	APL	55.7	ISDA (ResNet-50)
