Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation

Lukas Hoyer, Dengxin Dai, Luc van Gool

2021-11-29 · CVPR 2022
Tasks: Semantic Segmentation · Synthetic-to-Real Translation · Unsupervised Domain Adaptation · Image-to-Image Translation · Domain Adaptation
Links: Paper · PDF · Code (official)

Abstract

As acquiring pixel-wise annotations of real-world images for semantic segmentation is a costly process, a model can instead be trained with more accessible synthetic data and adapted to real images without requiring their annotations. This process is studied in unsupervised domain adaptation (UDA). Even though a large number of methods propose new adaptation strategies, they are mostly based on outdated network architectures. As the influence of recent network architectures has not been systematically studied, we first benchmark different network architectures for UDA and reveal the potential of Transformers for UDA semantic segmentation. Based on the findings, we propose a novel UDA method, DAFormer. The network architecture of DAFormer consists of a Transformer encoder and a multi-level context-aware feature fusion decoder. It is enabled by three simple but crucial training strategies to stabilize the training and to avoid overfitting to the source domain: (1) Rare Class Sampling on the source domain improves the quality of the pseudo-labels by mitigating the confirmation bias of self-training toward common classes, while (2) a Thing-Class ImageNet Feature Distance and (3) a learning rate warmup promote feature transfer from ImageNet pretraining. DAFormer represents a major advance in UDA. It improves the state of the art by 10.8 mIoU for GTA-to-Cityscapes and 5.4 mIoU for Synthia-to-Cityscapes, and enables even difficult classes such as train, bus, and truck to be learned well. The implementation is available at https://github.com/lhoyer/DAFormer.
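The Rare Class Sampling strategy described above can be sketched in a few lines. The softmax over (1 - f_c) / T, where f_c is the pixel frequency of class c and T is a temperature, follows the formulation in the paper; everything else here (the function names, the `image_classes` input listing which classes appear in each source image) is a hypothetical illustration, not the authors' implementation.

```python
import math
import random

def class_sampling_probs(class_freqs, temperature=0.01):
    """Softmax over (1 - f_c) / T: rarer classes get a higher sampling probability.

    class_freqs: per-class pixel frequencies in the source dataset, in [0, 1].
    A smaller temperature skews sampling more strongly toward rare classes.
    """
    scores = [math.exp((1.0 - f) / temperature) for f in class_freqs]
    total = sum(scores)
    return [s / total for s in scores]

def sample_source_image(image_classes, class_freqs, temperature=0.01, rng=random):
    """Draw a class according to its rarity, then a source image containing it.

    image_classes: for each source image, the set of class ids present in it.
    Returns the index of the sampled source image.
    """
    probs = class_sampling_probs(class_freqs, temperature)
    c = rng.choices(range(len(class_freqs)), weights=probs, k=1)[0]
    candidates = [i for i, classes in enumerate(image_classes) if c in classes]
    return rng.choice(candidates)
```

Because self-training pseudo-labels are biased toward classes the model already predicts confidently, oversampling source images that contain rare classes (e.g. rider, train) keeps those classes from vanishing from the pseudo-labels early in training.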

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image-to-Image Translation | GTAV-to-Cityscapes Labels | mIoU | 68.3 | DAFormer |
| Image-to-Image Translation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 67.4 | DAFormer |
| Image-to-Image Translation | SYNTHIA-to-Cityscapes | mIoU (16 classes) | 60.9 | DAFormer |
| Domain Adaptation | GTA5 to Cityscapes | mIoU | 68.3 | DAFormer |
| Domain Adaptation | GTAV-to-Cityscapes Labels | mIoU | 68.3 | DAFormer |
| Domain Adaptation | SYNTHIA-to-Cityscapes | mIoU | 60.9 | DAFormer |
| Domain Adaptation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 67.4 | DAFormer |
| Domain Adaptation | Cityscapes to ACDC | mIoU | 55.4 | DAFormer |
| Semantic Segmentation | Dark Zurich | mIoU | 53.8 | DAFormer |
| Semantic Segmentation | GTAV-to-Cityscapes Labels | mIoU | 68.3 | DAFormer |
| Semantic Segmentation | SYNTHIA-to-Cityscapes | mIoU | 60.9 | DAFormer |
| Unsupervised Domain Adaptation | GTAV-to-Cityscapes Labels | mIoU | 68.3 | DAFormer |
| Unsupervised Domain Adaptation | SYNTHIA-to-Cityscapes | mIoU | 60.9 | DAFormer |
| Unsupervised Domain Adaptation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 67.4 | DAFormer |

Related Papers

- SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
- DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
- SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
- Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
- A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
- SAMST: A Transformer framework based on SAM pseudo label filtering for remote sensing semi-supervised semantic segmentation (2025-07-16)
- Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
- U-RWKV: Lightweight medical image segmentation with direction-adaptive RWKV (2025-07-15)