Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation

Lukas Hoyer, Dengxin Dai, Haoran Wang, Luc van Gool

2022-12-02 · CVPR 2023

Tasks: Image Classification · Semantic Segmentation · Synthetic-to-Real Translation · Unsupervised Domain Adaptation · Object Detection · Image-to-Image Translation · Domain Adaptation

Paper · PDF · Code (official)

Abstract

In unsupervised domain adaptation (UDA), a model trained on source data (e.g. synthetic) is adapted to target data (e.g. real-world) without access to target annotation. Most previous UDA methods struggle with classes that have a similar visual appearance on the target domain as no ground truth is available to learn the slight appearance differences. To address this problem, we propose a Masked Image Consistency (MIC) module to enhance UDA by learning spatial context relations of the target domain as additional clues for robust visual recognition. MIC enforces the consistency between predictions of masked target images, where random patches are withheld, and pseudo-labels that are generated based on the complete image by an exponential moving average teacher. To minimize the consistency loss, the network has to learn to infer the predictions of the masked regions from their context. Due to its simple and universal concept, MIC can be integrated into various UDA methods across different visual recognition tasks such as image classification, semantic segmentation, and object detection. MIC significantly improves the state-of-the-art performance across the different recognition tasks for synthetic-to-real, day-to-nighttime, and clear-to-adverse-weather UDA. For instance, MIC achieves an unprecedented UDA performance of 75.9 mIoU and 92.8% on GTA-to-Cityscapes and VisDA-2017, respectively, which corresponds to an improvement of +2.1 and +3.0 percent points over the previous state of the art. The implementation is available at https://github.com/lhoyer/MIC.
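The core mechanism described in the abstract — withhold random patches of the target image, then penalize disagreement between the student's prediction on the masked image and the EMA teacher's pseudo-label from the complete image — can be sketched as follows. This is a minimal numpy illustration, not the official implementation (see the linked repository for that); the patch size, mask ratio, confidence threshold, and EMA momentum below are illustrative hyperparameters, and `random_patch_mask`, `consistency_loss`, and `ema_update` are hypothetical helper names.

```python
import numpy as np

def random_patch_mask(h, w, patch=64, ratio=0.7, rng=None):
    """Binary mask that withholds a random fraction of (patch x patch) cells.

    True = pixel kept, False = pixel masked out (withheld from the student).
    """
    rng = rng or np.random.default_rng(0)
    gh, gw = h // patch, w // patch
    keep = rng.random((gh, gw)) > ratio          # per-patch keep/drop decision
    # Expand each patch decision to pixel resolution.
    return np.repeat(np.repeat(keep, patch, axis=0), patch, axis=1)

def consistency_loss(student_probs, teacher_probs, conf_thresh=0.9):
    """Cross-entropy between student predictions on the *masked* image and
    pseudo-labels the EMA teacher produced from the *complete* image.

    Both inputs are per-pixel class probabilities of shape (H, W, C).
    Low-confidence teacher pixels are down-weighted to zero.
    """
    pseudo = teacher_probs.argmax(axis=-1)        # hard pseudo-labels
    conf = teacher_probs.max(axis=-1)             # teacher confidence
    weight = (conf >= conf_thresh).astype(float)  # confidence gating
    picked = np.take_along_axis(
        student_probs, pseudo[..., None], axis=-1
    ).squeeze(-1)
    ce = -np.log(picked + 1e-8)                   # per-pixel cross-entropy
    return float((weight * ce).sum() / max(weight.sum(), 1.0))

def ema_update(teacher_weights, student_weights, alpha=0.999):
    """Exponential-moving-average teacher update after each student step."""
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]
```

In a training loop, the teacher would first run on the full target image to produce `teacher_probs`, the student would run on the image multiplied by `random_patch_mask(...)`, and the resulting `consistency_loss` would be added to the supervised source-domain loss; minimizing it forces the student to infer the masked regions from their spatial context, which is exactly the effect MIC exploits.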

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Semantic Segmentation (UDA) | GTAV-to-Cityscapes | mIoU | 75.9 | HRDA+MIC |
| Semantic Segmentation (UDA) | SYNTHIA-to-Cityscapes | mIoU (16 classes) | 67.3 | MIC |
| Semantic Segmentation (UDA) | SYNTHIA-to-Cityscapes | mIoU (13 classes) | 74.0 | MIC |
| Semantic Segmentation (UDA) | Cityscapes-to-ACDC | mIoU | 70.4 | MIC |
| Semantic Segmentation (UDA) | Dark Zurich | mIoU | 60.2 | MIC |
| Object Detection (UDA) | Cityscapes-to-Foggy Cityscapes | mAP@0.5 | 47.6 | MIC |
| Image Classification (UDA) | VisDA-2017 | Accuracy | 92.8 | MIC |
| Image Classification (UDA) | Office-Home | Accuracy | 86.2 | MIC |

Related Papers

- SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
- Automatic Classification and Segmentation of Tunnel Cracks Based on Deep Learning and Visual Explanations (2025-07-18)
- Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
- Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
- Federated Learning for Commercial Image Sources (2025-07-17)
- MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
- DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
- SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)