Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation

Xiaokang Chen, Kwan-Yee Lin, Jingbo Wang, Wayne Wu, Chen Qian, Hongsheng Li, Gang Zeng

2020-07-17 | ECCV 2020
Tasks: Thermal Image Segmentation, Segmentation, Semantic Segmentation, Specificity, Object Detection

Abstract

Depth information has proven to be a useful cue in the semantic segmentation of RGB-D images, providing a geometric counterpart to the RGB representation. Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels, and model the problem as cross-modal feature fusion, aiming at better feature representations and more accurate segmentation. This, however, may not lead to satisfactory results, as actual depth data are generally noisy, which might worsen the accuracy as the networks go deeper. In this paper, we propose a unified and efficient Cross-modality Guided Encoder that not only effectively recalibrates RGB feature responses, but also distills accurate depth information via multiple stages and aggregates the two recalibrated representations alternately. The key to the proposed architecture is a novel Separation-and-Aggregation Gating operation that jointly filters and recalibrates both representations before cross-modality aggregation. Meanwhile, a Bi-direction Multi-step Propagation strategy is introduced, on the one hand, to help propagate and fuse information between the two modalities, and on the other hand, to preserve their specificity along the long-term propagation process. Besides, our proposed encoder can be easily injected into previous encoder-decoder structures to boost their performance on RGB-D semantic segmentation. Our model consistently outperforms state-of-the-art methods on challenging indoor and outdoor datasets. Code of this work is available at https://charlescxk.github.io/
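
The gating idea described in the abstract (filter and recalibrate both streams, then aggregate them) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the flat list-of-channels layout, the single linear layer per gate, and the weight shapes are all assumptions chosen to keep the example dependency-free. It only shows the general pattern of channel-wise cross-modality recalibration followed by a per-position soft gate over the two modalities.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def global_avg_pool(feat):
    """feat: list of C channels, each a flat list of spatial values."""
    return [sum(ch) / len(ch) for ch in feat]


def linear(vec, weight):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def separation(rgb, depth, w_rgb, w_depth):
    """Hypothetical 'separation' step: channel gates computed from BOTH
    streams filter each stream, and the filtered stream is added to the
    other one to recalibrate it. w_rgb, w_depth: C rows of 2C weights."""
    pooled = global_avg_pool(rgb) + global_avg_pool(depth)  # length 2C
    gate_rgb = [sigmoid(z) for z in linear(pooled, w_rgb)]
    gate_depth = [sigmoid(z) for z in linear(pooled, w_depth)]
    rgb_filtered = [[g * v for v in ch] for g, ch in zip(gate_rgb, rgb)]
    depth_filtered = [[g * v for v in ch] for g, ch in zip(gate_depth, depth)]
    # RGB is recalibrated by filtered depth, and vice versa.
    rgb_rec = [[a + b for a, b in zip(r, d)] for r, d in zip(rgb, depth_filtered)]
    depth_rec = [[a + b for a, b in zip(d, r)] for d, r in zip(depth, rgb_filtered)]
    return rgb_rec, depth_rec


def aggregation(rgb_rec, depth_rec, v_rgb, v_depth):
    """Hypothetical 'aggregation' step: at each spatial position, a softmax
    over two scores decides how much each modality contributes.
    v_rgb, v_depth: weight vectors of length 2C."""
    num_channels, num_positions = len(rgb_rec), len(rgb_rec[0])
    fused = [[0.0] * num_positions for _ in range(num_channels)]
    for p in range(num_positions):
        column = [ch[p] for ch in rgb_rec] + [ch[p] for ch in depth_rec]
        s_rgb = sum(w * x for w, x in zip(v_rgb, column))
        s_depth = sum(w * x for w, x in zip(v_depth, column))
        m = max(s_rgb, s_depth)  # subtract the max for numerical stability
        e_rgb, e_depth = math.exp(s_rgb - m), math.exp(s_depth - m)
        alpha = e_rgb / (e_rgb + e_depth)
        for c in range(num_channels):
            fused[c][p] = alpha * rgb_rec[c][p] + (1.0 - alpha) * depth_rec[c][p]
    return fused
```

Because the aggregation weights at each position sum to one, the fused value is always a convex combination of the two recalibrated features. In a real network the gates would be learned convolutional layers operating on feature maps rather than the hand-supplied weight lists used here.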

Results

Task	Dataset	Metric	Value	Model
Semantic Segmentation	US3D	mIoU	83.62	SA-Gate
Semantic Segmentation	THUD Robotic Dataset	mIoU	83.19	SA-Gate
Semantic Segmentation	Porto	IoU	72.21	SA-Gate
Semantic Segmentation	LLRGBD-synthetic	mIoU	61.79	SA-Gate (ResNet-101)
Semantic Segmentation	Event-based Segmentation Dataset	mIoU	84.08	SA-Gate
Semantic Segmentation	Potsdam	mIoU	84.28	SA-Gate
Semantic Segmentation	TLCGIS	IoU	84.2	SA-Gate
Semantic Segmentation	UrbanLF	mIoU (Syn)	79.53	SA-Gate
Semantic Segmentation	EventScape	mIoU	53.94	SA-Gate
Semantic Segmentation	Vaihingen	mIoU	81.03	SA-Gate
Semantic Segmentation	BJRoad	IoU	62.14	SA-Gate
Semantic Segmentation	Noisy RS RGB-T Dataset	mIoU	54	SA-Gate
Semantic Segmentation	MFN Dataset	mIoU	45.8	SA-Gate
Object Detection	DSEC	mAP	19.6	SAGate
Object Detection	PKU-DDD17-Car	mAP50	82	SAGate
3D	DSEC	mAP	19.6	SAGate
3D	PKU-DDD17-Car	mAP50	82	SAGate
2D Classification	DSEC	mAP	19.6	SAGate
2D Classification	PKU-DDD17-Car	mAP50	82	SAGate
Scene Segmentation	Noisy RS RGB-T Dataset	mIoU	54	SA-Gate
Scene Segmentation	MFN Dataset	mIoU	45.8	SA-Gate
2D Object Detection	DSEC	mAP	19.6	SAGate
2D Object Detection	PKU-DDD17-Car	mAP50	82	SAGate
2D Object Detection	Noisy RS RGB-T Dataset	mIoU	54	SA-Gate
2D Object Detection	MFN Dataset	mIoU	45.8	SA-Gate
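
Most rows above report IoU or mIoU. As a reminder of how these metrics are commonly computed, the sketch below builds a confusion matrix from flat label lists and averages the per-class intersection-over-union. The function names and the `ignore_index=255` convention are assumptions for illustration; individual benchmarks may use different ignore labels or averaging rules, and the table reports values as percentages.

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:  # assumed convention for unlabeled pixels
            continue
        cm[t][p] += 1
    return cm


def per_class_iou(cm):
    """IoU_k = TP_k / (TP_k + FP_k + FN_k); None if class k never appears."""
    ious = []
    for k in range(len(cm)):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp
        fp = sum(row[k] for row in cm) - tp
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


def mean_iou(cm):
    """Average IoU over the classes that appear in prediction or ground truth."""
    valid = [v for v in per_class_iou(cm) if v is not None]
    return sum(valid) / len(valid) if valid else 0.0
```

For example, with predictions `[0, 0, 1, 1, 2, 2]` and targets `[0, 1, 1, 1, 2, 255]`, the per-class IoUs are 1/2, 2/3 and 1, so the mean is 13/18, or roughly 72.2 when expressed as a percentage.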
