QTSeg: A Query Token-Based Architecture for Efficient 2D Medical Image Segmentation

Phuong-Nam Tran, Nhat Truong Pham, Duc Ngoc Minh Dang, Eui-Nam Huh, Choong Seon Hong

2024-12-23

Skin Lesion Segmentation, Breast Cancer Detection, Semantic Segmentation, Medical Image Segmentation, Image Segmentation

Abstract

Medical image segmentation is crucial for assisting doctors in making diagnoses and for enabling accurate automatic diagnosis. While advanced convolutional neural networks (CNNs) excel at segmenting regions of interest with pixel-level precision, they often struggle to capture long-range dependencies, which are crucial for model performance. Conversely, transformer architectures use attention mechanisms to handle long-range dependencies well. However, the computational complexity of transformers grows quadratically with the number of input tokens, which makes them resource-intensive, especially on high-resolution medical images. Recent research aims to combine CNN and transformer architectures to mitigate their drawbacks and improve performance while keeping resource demands low. Nevertheless, existing approaches have not fully leveraged the strengths of both architectures to achieve high accuracy with low computational requirements. To address this gap, we propose a novel architecture for 2D medical image segmentation (QTSeg) that uses a feature pyramid network (FPN) as the image encoder, a multi-level feature fusion (MLFF) module as the adapter between encoder and decoder, and a multi-query mask decoder (MQM Decoder) as the mask decoder. In the first step, an FPN model extracts pyramid features from the input image. Next, MLFF is placed between the encoder and decoder to adapt features from different encoder stages to the decoder. Finally, the MQM Decoder improves mask generation by integrating query tokens with pyramid features at all stages of the mask decoder. Our experimental results show that QTSeg outperforms state-of-the-art methods across all metrics while requiring less computation than the baseline and existing methods. Code is available at https://github.com/tpnam0901/QTSeg (v0.1.0).
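To make the quadratic-cost point concrete, the following is a minimal sketch of how the number of pairwise attention scores grows with image resolution. It assumes a ViT-style split of the image into non-overlapping square patches with a hypothetical patch size of 16; neither the patch size nor the helper name `attention_matrix_entries` comes from the paper, and the count covers a single attention head in a single layer only.

```python
def attention_matrix_entries(height, width, patch_size=16):
    """Return (tokens, pairwise attention scores) for one head over one image.

    Assumption: the image is cut into non-overlapping patch_size x patch_size
    patches, each becoming one token. patch_size=16 is illustrative only.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("height and width must be multiples of patch_size")
    tokens = (height // patch_size) * (width // patch_size)
    # Full self-attention scores every token against every token.
    return tokens, tokens * tokens


for side in (256, 512, 1024):
    tokens, entries = attention_matrix_entries(side, side)
    print(f"{side}x{side}: {tokens} tokens, {entries} attention scores")
```

Under these assumptions, doubling the image side length quadruples the token count and multiplies the number of attention scores by 16, which is why high-resolution inputs are costly for plain transformers.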

Results

Task	Dataset	Metric	Value	Model
Medical Image Segmentation	BKAI-IGH NeoPolyp-Small	Average Dice (5-folds)	93.13	QTSeg
Medical Image Segmentation	BKAI-IGH NeoPolyp-Small	MAE (5-folds)	0.06	QTSeg
Medical Image Segmentation	BKAI-IGH NeoPolyp-Small	mIoU (5-folds)	88.94	QTSeg
Medical Image Segmentation	ISIC2016	ACC	96.41	QTSeg
Medical Image Segmentation	ISIC2016	Average IOU	86.74	QTSeg
Medical Image Segmentation	ISIC2016	Dice	92.42	QTSeg
Medical Image Segmentation	ISIC2016	MAE	0.0359	QTSeg
Skin	ISIC2016	ACC	96.41	QTSeg
Skin	ISIC2016	Average IOU	86.74	QTSeg
Skin	ISIC2016	Dice	92.42	QTSeg
Skin	ISIC2016	MAE	0.0359	QTSeg
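
For readers unfamiliar with the metrics in the table, here is a minimal sketch of how Dice, IoU, pixel accuracy (ACC) and MAE are commonly computed for a pair of binary masks. The function name `segmentation_metrics` and the toy masks are hypothetical, and the paper's exact definitions (for example per-image versus per-dataset averaging, or the threshold applied to predictions) are not stated on this page, so this should be read as the standard textbook formulation rather than the authors' evaluation code.

```python
def segmentation_metrics(pred, target):
    """Compute Dice, IoU, accuracy and MAE for two binary masks.

    pred and target are equally sized 2D lists containing only 0 and 1.
    """
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if not p or len(p) != len(t):
        raise ValueError("masks must be non-empty and have the same shape")

    tp = sum(1 for a, b in zip(p, t) if a == 1 and b == 1)  # true positives
    fp = sum(1 for a, b in zip(p, t) if a == 1 and b == 0)  # false positives
    fn = sum(1 for a, b in zip(p, t) if a == 0 and b == 1)  # false negatives
    tn = len(p) - tp - fp - fn                              # true negatives

    union = tp + fp + fn
    # Convention chosen here: two empty masks count as a perfect match.
    dice = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    iou = tp / union if union else 1.0
    acc = (tp + tn) / len(p)
    mae = sum(abs(a - b) for a, b in zip(p, t)) / len(p)
    return {"dice": dice, "iou": iou, "acc": acc, "mae": mae}


# Toy 4x4 example: the prediction misses one foreground pixel.
pred = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
target = [[1, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(segmentation_metrics(pred, target))
```

For hard 0/1 masks, MAE equals 1 minus accuracy, since every misclassified pixel contributes an absolute error of 1. The ISIC2016 rows are consistent with this relationship (ACC 96.41, MAE 0.0359).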
