

StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation

Date: 2024-08-02
Tasks: Thermal Image Segmentation, Segmentation, Semantic Segmentation

Abstract

Multimodal semantic segmentation shows significant potential for enhancing segmentation accuracy in complex scenes. However, current methods often incorporate specialized feature fusion modules tailored to specific modalities, thereby restricting input flexibility and increasing the number of training parameters. To address these challenges, we propose StitchFusion, a straightforward yet effective modal fusion framework that integrates large-scale pre-trained models directly as encoders and feature fusers. This approach facilitates comprehensive multi-modal and multi-scale feature fusion, accommodating any visual modal inputs. Specifically, our framework achieves modal integration during encoding by sharing multi-modal visual information. To enhance information exchange across modalities, we introduce a multi-directional adapter module (MultiAdapter) that enables cross-modal information transfer during encoding. By leveraging MultiAdapter to propagate multi-scale information across the pre-trained encoders, StitchFusion integrates multi-modal visual information within the encoding process itself. Extensive comparative experiments demonstrate that our model achieves state-of-the-art performance on four multi-modal segmentation datasets with minimal additional parameters. Furthermore, the experimental integration of MultiAdapter with existing Feature Fusion Modules (FFMs) highlights their complementary nature. Our code is available at StitchFusion_repo.
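
The idea of cross-modal transfer during encoding can be illustrated with a small sketch. The abstract does not specify MultiAdapter's internals, so everything below is an assumption made for illustration: a bottleneck adapter (down-projection, ReLU, up-projection) per ordered pair of modalities, with each modality's feature receiving the adapted features of the other modalities as a residual. The names make_adapter, apply_adapter and exchange are hypothetical and are not taken from the StitchFusion code.

```python
import random


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def make_adapter(dim, hidden, rng):
    """Hypothetical bottleneck adapter: dim -> hidden -> dim (assumed design)."""
    down = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(hidden)]
    up = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(dim)]
    return down, up


def apply_adapter(adapter, x):
    """Project down, apply ReLU, project back up."""
    down, up = adapter
    hidden = [max(0.0, h) for h in matvec(down, x)]
    return matvec(up, hidden)


def exchange(features, adapters):
    """Add to each modality's feature the adapted features of every other modality.

    features: dict mapping modality name -> feature vector
    adapters: dict mapping (source, target) -> adapter
    """
    out = {}
    for tgt, x in features.items():
        y = list(x)  # residual path: the encoder's own feature is kept
        for src, z in features.items():
            if src == tgt:
                continue
            delta = apply_adapter(adapters[(src, tgt)], z)
            y = [a + b for a, b in zip(y, delta)]
        out[tgt] = y
    return out


rng = random.Random(0)
dim, hidden = 8, 2  # toy sizes, chosen only for the example
modalities = ["rgb", "depth", "event"]
features = {m: [rng.uniform(-1, 1) for _ in range(dim)] for m in modalities}
adapters = {
    (s, t): make_adapter(dim, hidden, rng)
    for s in modalities
    for t in modalities
    if s != t
}
fused = exchange(features, adapters)

# Each directed adapter holds 2 * dim * hidden weights in this sketch.
params_per_adapter = 2 * dim * hidden
```

In this toy setup the per-adapter cost is 2 * dim * hidden weights, which stays small when hidden is much smaller than dim. Adding a modality only adds adapters for the new pairs, so the same exchange step accepts any number of inputs. Whether the actual MultiAdapter uses this layout is not stated on this page.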

Results

Task                   Dataset      Metric  Value  Model
Semantic Segmentation  MCubeS       mIoU    53.92  StitchFusion (RGB-A-D-N)
Semantic Segmentation  MCubeS       mIoU    53.26  StitchFusion (RGB-A-D)
Semantic Segmentation  MCubeS       mIoU    53.21  StitchFusion (RGB-N)
Semantic Segmentation  MCubeS       mIoU    52.72  StitchFusion (RGB-D)
Semantic Segmentation  MCubeS       mIoU    52.68  StitchFusion (RGB-A)
Semantic Segmentation  FMB Dataset  mIoU    64.32  StitchFusion+FFMs (RGB-Infrared)
Semantic Segmentation  FMB Dataset  mIoU    63.3   StitchFusion (RGB-Infrared)
Semantic Segmentation  DeLiVER      mIoU    68.18  StitchFusion (RGB-D-E-LiDAR)
Semantic Segmentation  DeLiVER      mIoU    66.65  StitchFusion (RGB-D-LiDAR)
Semantic Segmentation  DeLiVER      mIoU    66.03  StitchFusion (RGB-D-Event)
Semantic Segmentation  DeLiVER      mIoU    65.75  StitchFusion (RGB-Depth)
Semantic Segmentation  DeLiVER      mIoU    58.03  StitchFusion (RGB-LiDAR)
Semantic Segmentation  DeLiVER      mIoU    57.44  StitchFusion (RGB-Event)
Semantic Segmentation  PST900       mIoU    85.35  StitchFusion (RGB-T)
Semantic Segmentation  MFN Dataset  mIoU    58.13  StitchFusion
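
Every result above is reported as mIoU (mean intersection over union), expressed as a percentage. For readers unfamiliar with the metric, here is a minimal sketch of the standard definition over flattened label arrays. The function name mean_iou is hypothetical. Averaging only over classes that occur in the prediction or the ground truth is an assumption; individual benchmarks may treat absent classes or ignore labels differently, and this page does not say how each dataset does it.

```python
def mean_iou(pred, target, num_classes):
    """Return mIoU in percent for two equal-length sequences of class ids."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Correct pixel: counts once toward intersection and union of its class.
            inter[p] += 1
            union[p] += 1
        else:
            # Wrong pixel: enlarges the union of both the predicted and the true class.
            union[p] += 1
            union[t] += 1
    # Assumption: classes absent from both prediction and target are skipped.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    if not ious:
        return 0.0
    return 100.0 * sum(ious) / len(ious)


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(round(score, 2))
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5, that is, 50 on the percentage scale used in the table.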

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)