
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Smoothing Matters: Momentum Transformer for Domain Adaptive Semantic Segmentation

Runfa Chen, Yu Rong, Shangmin Guo, Jiaqi Han, Fuchun Sun, Tingyang Xu, Wenbing Huang

2022-03-15 | Segmentation, Semantic Segmentation, Synthetic-to-Real Translation, Unsupervised Domain Adaptation, Image-to-Image Translation, Domain Adaptation

Abstract

After the great success of Vision Transformer variants (ViTs) in computer vision, they have also demonstrated great potential in domain adaptive semantic segmentation. Unfortunately, straightforwardly applying local ViTs to domain adaptive semantic segmentation does not bring the expected improvement. We find that the pitfall of local ViTs lies in the severe high-frequency components generated during both pseudo-label construction and feature alignment for target domains. These high-frequency components make the training of local ViTs very unsmooth and hurt their transferability. In this paper, we introduce a low-pass filtering mechanism, the momentum network, to smooth the learning dynamics of target domain features and pseudo labels. Furthermore, we propose a dynamic discrepancy measurement that aligns the distributions of the source and target domains, using dynamic weights to evaluate the importance of the samples. With these issues addressed, extensive experiments on sim2real benchmarks show that the proposed method outperforms state-of-the-art methods. Our code is available at https://github.com/alpc91/TransDA
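The abstract describes the momentum network as a low-pass filter. A common way to build such a network is an exponential moving average (EMA) of the online network's parameters, and the sketch below illustrates that general idea only. It is not taken from the TransDA code: the function names `ema_update` and `smooth`, the flat list-of-floats parameter representation, and the decay values are all assumptions made for illustration.

```python
def ema_update(momentum_params, online_params, decay=0.99):
    """Return momentum parameters after one exponential-moving-average step.

    Hypothetical helper: parameters are plain lists of floats here, and the
    default decay is an illustrative value, not one reported in the paper.
    """
    if len(momentum_params) != len(online_params):
        raise ValueError("parameter lists must have the same length")
    return [decay * m + (1.0 - decay) * p
            for m, p in zip(momentum_params, online_params)]


def smooth(signal, decay=0.9):
    """Apply the same EMA recurrence to a 1-D sequence.

    The state is seeded with the first value, so a constant input is
    returned unchanged while rapid oscillations are attenuated.
    """
    if not signal:
        return []
    smoothed = []
    state = signal[0]
    for x in signal:
        state = decay * state + (1.0 - decay) * x
        smoothed.append(state)
    return smoothed


if __name__ == "__main__":
    # A signal that flips sign every step is the highest-frequency input
    # a discrete sequence can carry; the EMA shrinks its swings.
    noisy = [1.0, -1.0] * 10
    filtered = smooth(noisy)
    print("largest late-stage magnitude:", max(abs(v) for v in filtered[10:]))
```

With a decay close to 1, the averaged parameters change slowly even when the online parameters oscillate, which is the low-pass behaviour the abstract refers to. How the paper applies this to features and pseudo labels is not specified in the text above.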

Results

Task | Dataset | Metric | Value | Model
Image-to-Image Translation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 66.3 | TransDA-B
Image-to-Image Translation | GTAV-to-Cityscapes Labels | mIoU | 63.9 | TransDA-B
Image-to-Image Translation | SYNTHIA-to-Cityscapes | MIoU (13 classes) | 66.3 | TransDA-B
Image-to-Image Translation | SYNTHIA-to-Cityscapes | MIoU (16 classes) | 59.3 | TransDA-B
Domain Adaptation | GTA5 to Cityscapes | mIoU | 63.9 | TransDA-B
Domain Adaptation | GTAV-to-Cityscapes Labels | mIoU | 63.9 | TransDA-B
Domain Adaptation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 66.3 | TransDA-B
Image Generation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 66.3 | TransDA-B
Image Generation | GTAV-to-Cityscapes Labels | mIoU | 63.9 | TransDA-B
Image Generation | SYNTHIA-to-Cityscapes | MIoU (13 classes) | 66.3 | TransDA-B
Image Generation | SYNTHIA-to-Cityscapes | MIoU (16 classes) | 59.3 | TransDA-B
Semantic Segmentation | GTAV-to-Cityscapes Labels | mIoU | 63.9 | TransDA-B
Semantic Segmentation | SYNTHIA-to-Cityscapes | Mean IoU | 59.3 | TransDA-B
Unsupervised Domain Adaptation | GTAV-to-Cityscapes Labels | mIoU | 63.9 | TransDA-B
Unsupervised Domain Adaptation | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 66.3 | TransDA-B
10-shot image generation | GTAV-to-Cityscapes Labels | mIoU | 63.9 | TransDA-B
10-shot image generation | SYNTHIA-to-Cityscapes | Mean IoU | 59.3 | TransDA-B
1 Image, 2*2 Stitching | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 66.3 | TransDA-B
1 Image, 2*2 Stitching | GTAV-to-Cityscapes Labels | mIoU | 63.9 | TransDA-B
1 Image, 2*2 Stitching | SYNTHIA-to-Cityscapes | MIoU (13 classes) | 66.3 | TransDA-B
1 Image, 2*2 Stitching | SYNTHIA-to-Cityscapes | MIoU (16 classes) | 59.3 | TransDA-B

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)