Siqi Fan, Zhe Wang, Yan Wang, Jingjing Liu
For semantic segmentation in urban scene understanding, RGB cameras alone often fail to capture a clear holistic topology under challenging lighting conditions. The thermal signal is an informative additional channel that can reveal the contours and fine-grained texture of blurred regions in low-quality RGB images. Aiming at practical RGB-T (thermal) segmentation, we propose a Spatial-aware Demand-guided Recursive Meshing (SpiderMesh) framework that: 1) proactively compensates for inadequate contextual semantics in optically impaired regions via a demand-guided target masking algorithm; 2) refines multimodal semantic features with recursive meshing to improve pixel-level semantic analysis performance. We further introduce an asymmetric data augmentation technique, M-CutOut, and enable semi-supervised learning to make full use of RGB-T labels, which are only sparsely available in practice. Extensive experiments on the MFNet and PST900 datasets demonstrate that SpiderMesh achieves state-of-the-art performance on standard RGB-T segmentation benchmarks.
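The abstract names M-CutOut only as an asymmetric augmentation and does not describe its mechanics. As a rough illustration, and purely as an assumption, one reading of "asymmetric" is that a random patch is masked in only one of the two modalities, so the network has to recover the missing region from the other. The helper name `asymmetric_cutout`, its parameters, and the single-channel grid representation below are all hypothetical and not taken from the paper.

```python
import random


def asymmetric_cutout(rgb, thermal, patch=2, fill=0, rng=None):
    """Mask one random square patch in exactly one modality.

    Hypothetical sketch: `rgb` and `thermal` are equally sized 2D lists
    (single-channel grids for simplicity). Inputs are not modified.
    """
    rng = rng or random.Random()
    height, width = len(rgb), len(rgb[0])

    # Top-left corner of the patch, chosen so the patch stays in bounds.
    top = rng.randrange(height - patch + 1)
    left = rng.randrange(width - patch + 1)

    # Work on copies so the caller's data stays untouched.
    rgb_out = [row[:] for row in rgb]
    thermal_out = [row[:] for row in thermal]

    # Assumed asymmetry: only one of the two modalities is masked.
    masked = rgb_out if rng.random() < 0.5 else thermal_out
    for y in range(top, top + patch):
        for x in range(left, left + patch):
            masked[y][x] = fill

    return rgb_out, thermal_out


rgb = [[1] * 4 for _ in range(4)]
thermal = [[1] * 4 for _ in range(4)]
rgb_aug, thermal_aug = asymmetric_cutout(rgb, thermal, patch=2, rng=random.Random(0))
```

In this sketch the segmentation label would stay unchanged, since only the input is degraded; whether the original method does the same is not stated in the abstract.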
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | PST900 | mIoU | 82.3 | SpiderMesh |
| Semantic Segmentation | MFN Dataset | mIoU | 58.4 | SpiderMesh (B4) |
| Semantic Segmentation | MFN Dataset | mIoU | 57.9 | SpiderMesh (ResNet-152) |
| Semantic Segmentation | MFN Dataset | mIoU | 56.1 | SpiderMesh (ResNet-101) |
| Semantic Segmentation | MFN Dataset | mIoU | 54.4 | SpiderMesh (ResNet-50) |
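
The table reports mIoU (mean intersection over union) as a percentage. For readers unfamiliar with the metric, here is a minimal sketch of how it can be computed from flat lists of predicted and ground-truth class labels. The function name `mean_iou` and the toy labels are illustrative assumptions. Benchmark evaluation code typically accumulates counts over the whole test set before averaging, and this sketch is not the authors' evaluation script.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in `pred` or `target`.

    `pred` and `target` are flat sequences of integer class labels,
    one entry per pixel.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(round(100 * mean_iou(pred, target, num_classes=3), 1))  # prints 50.0
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, which average to 0.5.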