Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Explicit Attention-Enhanced Fusion for RGB-Thermal Perception Tasks

Mingjian Liang, Junjie Hu, Chenyu Bao, Hua Feng, Fuqin Deng, Tin Lun Lam

2023-03-28
Thermal Image Segmentation, Crowd Counting, Semantic Segmentation, Salient Object Detection, Object Detection

Abstract

Recently, RGB-Thermal based perception has shown significant advances. Thermal information provides useful clues when visual cameras suffer from poor lighting conditions, such as low light and fog. However, how to effectively fuse RGB images and thermal data remains an open challenge. Previous works involve naive fusion strategies such as merging them at the input, concatenating multi-modality features inside models, or applying attention to each data modality. These fusion strategies are straightforward yet insufficient. In this paper, we propose a novel fusion method named Explicit Attention-Enhanced Fusion (EAEF) that fully takes advantage of each type of data. Specifically, we consider the following cases: i) both RGB data and thermal data, ii) only one of the types of data, and iii) none of them generate discriminative features. EAEF uses one branch to enhance feature extraction for i) and iii) and the other branch to remedy insufficient representations for ii). The outputs of the two branches are fused to form complementary features. As a result, the proposed fusion method outperforms the state of the art by 1.6% in mIoU on semantic segmentation, 3.1% in MAE on salient object detection, 2.3% in mAP on object detection, and 8.1% in MAE on crowd counting. The code is available at https://github.com/FreeformRobotics/EAEFNet.
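The case analysis in the abstract (both modalities discriminative, only one, or neither) can be illustrated with a small toy. The sketch below is not the EAEF architecture, which the abstract does not specify in detail; the per-channel scores, the agreement gate, and every function and variable name are hypothetical and chosen only to show how two branches might divide the three cases and then be fused.

```python
def toy_two_branch_fusion(f_rgb, f_th, s_rgb, s_th):
    """Toy per-channel fusion of RGB and thermal features.

    f_rgb, f_th: per-channel feature values (lists of floats).
    s_rgb, s_th: hypothetical per-channel "discriminativeness" scores in [0, 1].
    This is an illustrative assumption, not the method from the paper.
    """
    fused = []
    for fr, ft, sr, st in zip(f_rgb, f_th, s_rgb, s_th):
        # Agreement gate: close to 1 when both scores are high (case i)
        # or both are low (case iii), close to 0 when they disagree (case ii).
        agree = sr * st + (1.0 - sr) * (1.0 - st)

        # Branch A handles cases i and iii by combining both modalities.
        branch_a = 0.5 * (fr + ft)

        # Branch B handles case ii by leaning on the higher-scoring modality.
        total = sr + st
        branch_b = (sr * fr + st * ft) / total if total > 0 else branch_a

        # Fuse the two branch outputs, weighted by the gate.
        fused.append(agree * branch_a + (1.0 - agree) * branch_b)
    return fused


# Three channels: both discriminative, only RGB discriminative, neither.
fused = toy_two_branch_fusion(
    f_rgb=[2.0, 4.0, 1.0],
    f_th=[4.0, 0.0, 3.0],
    s_rgb=[1.0, 1.0, 0.0],
    s_th=[1.0, 0.0, 0.0],
)
print(fused)  # [3.0, 4.0, 2.0]
```

In this toy, the second channel takes its value entirely from the RGB feature because only RGB is scored as discriminative, while the first and third channels average the two modalities. The official implementation is in the repository linked in the abstract.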

Results

Task                      Dataset                 Metric  Value  Model
Semantic Segmentation     Noisy RS RGB-T Dataset  mIoU    60     EAEFNet
Semantic Segmentation     MFN Dataset             mIoU    58.9   EAEFNet (ResNet-152)
Semantic Segmentation     MFN Dataset             mIoU    55.9   EAEFNet (ResNet-50)
Scene Segmentation        Noisy RS RGB-T Dataset  mIoU    60     EAEFNet
Scene Segmentation        MFN Dataset             mIoU    58.9   EAEFNet (ResNet-152)
Scene Segmentation        MFN Dataset             mIoU    55.9   EAEFNet (ResNet-50)
2D Object Detection       Noisy RS RGB-T Dataset  mIoU    60     EAEFNet
2D Object Detection       MFN Dataset             mIoU    58.9   EAEFNet (ResNet-152)
2D Object Detection       MFN Dataset             mIoU    55.9   EAEFNet (ResNet-50)
10-shot image generation  Noisy RS RGB-T Dataset  mIoU    60     EAEFNet
10-shot image generation  MFN Dataset             mIoU    58.9   EAEFNet (ResNet-152)
10-shot image generation  MFN Dataset             mIoU    55.9   EAEFNet (ResNet-50)
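
The metric reported above, mIoU (mean intersection-over-union), averages the per-class ratio of correctly labeled pixels to the union of predicted and true pixels for that class. The following is a minimal sketch of the usual definition, assuming flattened integer label maps; the function name and toy data are illustrative, and benchmark evaluation scripts may differ in details such as accumulating over a whole dataset or ignoring an unlabeled class.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean IoU over classes that occur in the labels or the predictions."""
    # confusion[t][p] counts pixels whose true class is t and predicted class is p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: 6 pixels, 3 classes (flattened label maps).
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(truth, pred, num_classes=3)
print(score)  # 0.5
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, which average to 0.5. Leaderboards such as the table above typically report this value as a percentage.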

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images (2025-07-17)
Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection (2025-07-17)