Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation

Xiaokang Chen, Kwan-Yee Lin, Jingbo Wang, Wayne Wu, Chen Qian, Hongsheng Li, Gang Zeng

2020-07-17 | ECCV 2020 8 | Thermal Image Segmentation, Segmentation, Semantic Segmentation, Specificity, Object Detection

Abstract

Depth information has proven to be a useful cue in the semantic segmentation of RGB-D images because it provides a geometric counterpart to the RGB representation. Most existing works simply assume that depth measurements are accurate and well aligned with the RGB pixels, and model the problem as cross-modal feature fusion to obtain better feature representations and more accurate segmentation. This, however, may not lead to satisfactory results, as actual depth data are generally noisy, which might worsen accuracy as the networks go deeper. In this paper, we propose a unified and efficient Cross-modality Guided Encoder that not only effectively recalibrates RGB feature responses but also distills accurate depth information over multiple stages and aggregates the two recalibrated representations alternately. The key to the proposed architecture is a novel Separation-and-Aggregation Gating operation that jointly filters and recalibrates both representations before cross-modality aggregation. Meanwhile, a Bi-direction Multi-step Propagation strategy is introduced, on the one hand to help propagate and fuse information between the two modalities, and on the other hand to preserve their specificity along the long-term propagation process. In addition, the proposed encoder can be easily injected into previous encoder-decoder structures to boost their performance on RGB-D semantic segmentation. Our model consistently outperforms state-of-the-art methods on challenging indoor and outdoor datasets. Code for this work is available at https://charlescxk.github.io/
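
The abstract describes the Separation-and-Aggregation Gating operation only at a high level: both modalities are filtered and recalibrated, then aggregated. A minimal NumPy sketch of that idea follows. It is not the authors' implementation; the global average pooling, the sigmoid channel gates, the per-pixel softmax over two gate maps, and every name in it (`sa_gate_sketch`, `w_rgb`, `w_depth`, `w_gate`) are assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sa_gate_sketch(rgb, depth, w_rgb, w_depth, w_gate):
    """Illustrative gate over two (C, H, W) feature maps.

    w_rgb, w_depth: hypothetical (C, 2C) weights that turn a pooled
        descriptor into per-channel gates.
    w_gate: hypothetical (2, 2C) weights acting like a 1x1 convolution
        that yields one spatial gate map per modality.
    """
    # Separation (assumed form): pool both modalities into one descriptor.
    pooled = np.concatenate([rgb, depth], axis=0).mean(axis=(1, 2))  # (2C,)

    # Filter each modality with channel gates computed from the descriptor.
    rgb_filtered = rgb * sigmoid(w_rgb @ pooled)[:, None, None]
    depth_filtered = depth * sigmoid(w_depth @ pooled)[:, None, None]

    # Recalibrate each stream with the filtered signal of the other one.
    rgb_rec = rgb + depth_filtered
    depth_rec = depth + rgb_filtered

    # Aggregation (assumed form): per-pixel softmax over two gate logits.
    stacked = np.concatenate([rgb_rec, depth_rec], axis=0)  # (2C, H, W)
    logits = np.einsum("gc,chw->ghw", w_gate, stacked)      # (2, H, W)
    logits = logits - logits.max(axis=0, keepdims=True)     # numerical stability
    gates = np.exp(logits)
    gates = gates / gates.sum(axis=0, keepdims=True)

    # The fused map is a per-pixel convex combination of the two streams.
    fused = gates[0] * rgb_rec + gates[1] * depth_rec
    return rgb_rec, depth_rec, fused, gates
```

In this sketch the two recalibrated streams are returned alongside the fused map, which mirrors the abstract's point that modality-specific features are preserved while information is exchanged. How the outputs are fed to later encoder stages is not specified in the abstract and is left out here.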

Results

Task | Dataset | Metric | Value | Model
Semantic Segmentation | US3D | mIoU | 83.62 | SA-Gate
Semantic Segmentation | THUD Robotic Dataset | mIoU | 83.19 | SA-Gate
Semantic Segmentation | Porto | IoU | 72.21 | SA-Gate
Semantic Segmentation | LLRGBD-synthetic | mIoU | 61.79 | SA-Gate (ResNet-101)
Semantic Segmentation | Event-based Segmentation Dataset | mIoU | 84.08 | SA-Gate
Semantic Segmentation | Potsdam | mIoU | 84.28 | SA-Gate
Semantic Segmentation | TLCGIS | IoU | 84.2 | SA-Gate
Semantic Segmentation | UrbanLF | mIoU (Syn) | 79.53 | SA-Gate
Semantic Segmentation | EventScape | mIoU | 53.94 | SA-Gate
Semantic Segmentation | Vaihingen | mIoU | 81.03 | SA-Gate
Semantic Segmentation | BJRoad | IoU | 62.14 | SA-Gate
Semantic Segmentation | Noisy RS RGB-T Dataset | mIoU | 54 | SA-Gate
Semantic Segmentation | MFN Dataset | mIOU | 45.8 | SA-Gate
Object Detection | DSEC | mAP | 19.6 | SAGate
Object Detection | PKU-DDD17-Car | mAP50 | 82 | SAGate
3D | DSEC | mAP | 19.6 | SAGate
3D | PKU-DDD17-Car | mAP50 | 82 | SAGate
2D Classification | DSEC | mAP | 19.6 | SAGate
2D Classification | PKU-DDD17-Car | mAP50 | 82 | SAGate
Scene Segmentation | Noisy RS RGB-T Dataset | mIoU | 54 | SA-Gate
Scene Segmentation | MFN Dataset | mIOU | 45.8 | SA-Gate
2D Object Detection | DSEC | mAP | 19.6 | SAGate
2D Object Detection | PKU-DDD17-Car | mAP50 | 82 | SAGate
2D Object Detection | Noisy RS RGB-T Dataset | mIoU | 54 | SA-Gate
2D Object Detection | MFN Dataset | mIOU | 45.8 | SA-Gate
10-shot image generation | US3D | mIoU | 83.62 | SA-Gate
10-shot image generation | THUD Robotic Dataset | mIoU | 83.19 | SA-Gate
10-shot image generation | Porto | IoU | 72.21 | SA-Gate
10-shot image generation | LLRGBD-synthetic | mIoU | 61.79 | SA-Gate (ResNet-101)
10-shot image generation | Event-based Segmentation Dataset | mIoU | 84.08 | SA-Gate
10-shot image generation | Potsdam | mIoU | 84.28 | SA-Gate
10-shot image generation | TLCGIS | IoU | 84.2 | SA-Gate
10-shot image generation | UrbanLF | mIoU (Syn) | 79.53 | SA-Gate
10-shot image generation | EventScape | mIoU | 53.94 | SA-Gate
10-shot image generation | Vaihingen | mIoU | 81.03 | SA-Gate
10-shot image generation | BJRoad | IoU | 62.14 | SA-Gate
10-shot image generation | Noisy RS RGB-T Dataset | mIoU | 54 | SA-Gate
10-shot image generation | MFN Dataset | mIOU | 45.8 | SA-Gate
16k | DSEC | mAP | 19.6 | SAGate
16k | PKU-DDD17-Car | mAP50 | 82 | SAGate
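
Most rows above report mIoU, the mean over classes of intersection-over-union between predicted and ground-truth label maps. The sketch below shows one common way to compute it on flattened label lists. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions; individual benchmarks may use different conventions (ignored labels, per-image averaging), so this is not necessarily the protocol behind any value in the table.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union for two equal-length lists of class ids."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel counted once in both intersection and union of its class.
            inter[p] += 1
            union[p] += 1
        else:
            # Mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Assumption: classes that never occur are left out of the mean.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Class 0: IoU = 1/2, class 1: IoU = 2/3, so the mean is 7/12.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Leaderboards such as the one above typically report this value multiplied by 100.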
