Haoyu Ma, Xiangru Lin, Zifeng Wu, Yizhou Yu
Unsupervised domain adaptation (UDA) in semantic segmentation is a fundamental and promising task that relieves the need for laborious annotation. However, domain shift between the source and target domains compromises the final segmentation performance. Based on our observations, the main causes of domain shift are differences in imaging conditions, which we call image-level domain shifts, and differences in object category configurations, which we call category-level domain shifts. In this paper, we propose a novel UDA pipeline that unifies image-level alignment and category-level feature distribution regularization in a coarse-to-fine manner. Specifically, on the coarse side, we propose a photometric alignment module that aligns an image in the source domain with a reference image from the target domain using a set of image-level operators; on the fine side, we propose a category-oriented triplet loss that imposes a soft constraint to regularize category centers in the source domain, together with a self-supervised consistency regularization method in the target domain. Experimental results show that our proposed pipeline improves the generalization capability of the final segmentation model and significantly outperforms all previous state-of-the-art methods.
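To make the coarse and fine sides more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that image-level alignment can be approximated by matching per-channel mean and standard deviation (a simple stand-in for the paper's set of image-level operators), and that the category-level term can be approximated by a hinge loss over distances to category centers. The function names `match_channel_stats` and `center_triplet_loss`, the `margin` value, and the use of Euclidean distance are all hypothetical choices made for illustration.

```python
import numpy as np


def match_channel_stats(source, reference, eps=1e-6):
    """Shift and scale each channel of `source` to the mean/std of `reference`.

    Assumption: a stand-in for photometric alignment, not the paper's operators.
    Both inputs are H x W x C arrays with values in [0, 255].
    """
    src = source.astype(np.float64)
    ref = reference.astype(np.float64)
    src_mean, src_std = src.mean(axis=(0, 1)), src.std(axis=(0, 1))
    ref_mean, ref_std = ref.mean(axis=(0, 1)), ref.std(axis=(0, 1))
    aligned = (src - src_mean) / (src_std + eps) * ref_std + ref_mean
    return np.clip(aligned, 0.0, 255.0)


def center_triplet_loss(features, labels, centers, margin=1.0):
    """Hinge loss over category centers (assumed form, for illustration only).

    Each feature should be closer to its own category center than to the
    nearest other center by at least `margin`.
    features: (N, D), labels: (N,) integer class ids, centers: (K, D).
    """
    # Euclidean distance from every feature to every center, shape (N, K).
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    idx = np.arange(len(labels))
    pos = dists[idx, labels]            # distance to the feature's own center
    masked = dists.copy()
    masked[idx, labels] = np.inf        # exclude the own center
    neg = masked.min(axis=1)            # distance to the nearest other center
    return np.maximum(0.0, pos - neg + margin).mean()
```

In this sketch, a source image would first be passed through `match_channel_stats` with a target-domain reference image, and `center_triplet_loss` would be added to the usual segmentation loss on source features. The self-supervised consistency regularization on the target domain is not covered here.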
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image-to-Image Translation | GTAV-to-Cityscapes Labels | mIoU | 56.1 | Coarse-to-Fine |
| Image-to-Image Translation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 55.5 | Coarse-to-Fine (ResNet-101) |
| Image-to-Image Translation | SYNTHIA-to-Cityscapes | mIoU (16 classes) | 48.2 | Coarse-to-Fine (ResNet-101) |
| Image Generation | GTAV-to-Cityscapes Labels | mIoU | 56.1 | Coarse-to-Fine |
| Image Generation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 55.5 | Coarse-to-Fine (ResNet-101) |
| Image Generation | SYNTHIA-to-Cityscapes | mIoU (16 classes) | 48.2 | Coarse-to-Fine (ResNet-101) |
| 1 Image, 2*2 Stitching | GTAV-to-Cityscapes Labels | mIoU | 56.1 | Coarse-to-Fine |
| 1 Image, 2*2 Stitching | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 55.5 | Coarse-to-Fine (ResNet-101) |
| 1 Image, 2*2 Stitching | SYNTHIA-to-Cityscapes | mIoU (16 classes) | 48.2 | Coarse-to-Fine (ResNet-101) |