Qianyu Zhou, Zhengyang Feng, Qiqi Gu, Jiangmiao Pang, Guangliang Cheng, Xuequan Lu, Jianping Shi, Lizhuang Ma
Unsupervised domain adaptation (UDA) aims to adapt a model trained on a labeled source domain to an unlabeled target domain. Existing UDA-based semantic segmentation approaches typically reduce the domain shift at the pixel, feature, and output levels. However, almost all of them neglect contextual dependency, which is generally shared across domains, and this leads to suboptimal performance. In this paper, we propose a novel Context-Aware Mixup (CAMix) framework for domain adaptive semantic segmentation, which exploits contextual dependency as explicit prior knowledge in a fully end-to-end trainable manner to enhance adaptability toward the target domain. First, we present a contextual mask generation strategy that leverages accumulated spatial distributions and prior contextual relationships. The generated contextual mask is critical in this work and guides the context-aware domain mixup on three different levels. In addition, given this context knowledge, we introduce a significance-reweighted consistency loss that penalizes the inconsistency between the mixed student prediction and the mixed teacher prediction, which alleviates negative transfer during adaptation, e.g., early performance degradation. Extensive experiments and analysis demonstrate the effectiveness of our method against state-of-the-art approaches on widely used UDA benchmarks.
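
To make the two ingredients named in the abstract more concrete, the following is a minimal NumPy sketch of a mask-guided mixup and a weighted consistency loss. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how the contextual mask or the significance weights are computed, so both are taken here as given inputs, and the function names `mask_mixup` and `weighted_consistency_loss`, as well as the squared-error form of the loss, are hypothetical choices.

```python
import numpy as np


def mask_mixup(source, target, mask):
    """Blend two (C, H, W) arrays with a binary (H, W) mask.

    Pixels where mask == 1 are taken from `source`, the rest from `target`.
    How the mask is produced is outside this sketch (assumed given).
    """
    mask = mask.astype(source.dtype)
    # The (H, W) mask broadcasts over the channel axis.
    return mask * source + (1.0 - mask) * target


def weighted_consistency_loss(student_probs, teacher_probs, weights):
    """Weighted mean squared error between two (K, H, W) probability maps.

    `weights` is an (H, W) per-pixel map; using squared error and a
    weight-normalized mean are assumptions made for illustration.
    """
    per_pixel = ((student_probs - teacher_probs) ** 2).sum(axis=0)  # (H, W)
    return float((weights * per_pixel).sum() / max(weights.sum(), 1e-8))


# Toy example: a 2x2 "image" with 3 channels and a checkerboard mask.
source = np.ones((3, 2, 2))
target = np.zeros((3, 2, 2))
mask = np.array([[1, 0], [0, 1]])
mixed = mask_mixup(source, target, mask)

# Toy predictions over 2 classes: uniform student, confident teacher.
student = np.full((2, 2, 2), 0.5)
teacher = np.zeros((2, 2, 2))
teacher[0] = 1.0
loss = weighted_consistency_loss(student, teacher, np.ones((2, 2)))
```

The same mask can be applied to images, labels, or predictions alike, which is one plausible reading of mixing "on three different levels"; the abstract itself does not spell out which levels are meant.
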
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image-to-Image Translation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 69.2 | CAMix (w DAFormer) |
| Image-to-Image Translation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 59.7 | CAMix (w Deeplabv2 ResNet 101) |
| Image-to-Image Translation | GTAV-to-Cityscapes Labels | mIoU | 70 | CAMix (w DAFormer) |
| Image-to-Image Translation | GTAV-to-Cityscapes Labels | mIoU | 55.2 | CAMix (w Deeplabv2 ResNet 101) |
| Domain Adaptation | GTAV-to-Cityscapes Labels | mIoU | 70 | CAMix (w DAFormer) |
| Domain Adaptation | GTAV-to-Cityscapes Labels | mIoU | 55.2 | CAMix (w Deeplabv2 ResNet 101) |
| Image Generation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 69.2 | CAMix (w DAFormer) |
| Image Generation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 59.7 | CAMix (w Deeplabv2 ResNet 101) |
| Image Generation | GTAV-to-Cityscapes Labels | mIoU | 70 | CAMix (w DAFormer) |
| Image Generation | GTAV-to-Cityscapes Labels | mIoU | 55.2 | CAMix (w Deeplabv2 ResNet 101) |
| Unsupervised Domain Adaptation | GTAV-to-Cityscapes Labels | mIoU | 70 | CAMix (w DAFormer) |
| Unsupervised Domain Adaptation | GTAV-to-Cityscapes Labels | mIoU | 55.2 | CAMix (w Deeplabv2 ResNet 101) |
| 1 Image, 2*2 Stitching | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 69.2 | CAMix (w DAFormer) |
| 1 Image, 2*2 Stitching | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 59.7 | CAMix (w Deeplabv2 ResNet 101) |
| 1 Image, 2*2 Stitching | GTAV-to-Cityscapes Labels | mIoU | 70 | CAMix (w DAFormer) |
| 1 Image, 2*2 Stitching | GTAV-to-Cityscapes Labels | mIoU | 55.2 | CAMix (w Deeplabv2 ResNet 101) |
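
For readers unfamiliar with the metric in the table, mean intersection over union (mIoU) can be sketched as follows. This is a minimal single-image version under assumptions: the function name `mean_iou` is hypothetical, `ignore_index=255` follows the common Cityscapes convention for unlabeled pixels, and benchmark evaluations usually accumulate the confusion matrix over the whole validation set before averaging rather than scoring one image at a time.

```python
import numpy as np


def mean_iou(pred, label, num_classes, ignore_index=255):
    """Mean IoU over the classes that appear in `pred` or `label`.

    `pred` and `label` are integer arrays of the same shape holding
    class ids in [0, num_classes); pixels labeled `ignore_index` are skipped.
    """
    valid = label != ignore_index
    # Encode each (label, pred) pair as one index, then count occurrences
    # to build the confusion matrix (rows: ground truth, columns: prediction).
    idx = label[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
    conf = np.bincount(idx, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)

    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    present = union > 0  # skip classes absent from both arrays
    return float((tp[present] / union[present]).mean())


label = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
miou = mean_iou(pred, label, num_classes=2)
```

In this toy case the per-class IoUs are 1/2 and 2/3, so the mean is 7/12. The table reports the metric as a percentage, and "mIoU (13 classes)" means the average is taken over a 13-class subset.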