Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

You Only Need Adversarial Supervision for Semantic Image Synthesis

Vadim Sushko, Edgar Schönfeld, Dan Zhang, Juergen Gall, Bernt Schiele, Anna Khoreva

Published: 2020-12-08
Venue: ICLR 2021
Tasks: Semantic Segmentation, Image Generation, Image-to-Image Translation

Abstract

Despite their recent successes, GAN models for semantic image synthesis still suffer from poor image quality when trained with only adversarial supervision. Historically, additionally employing the VGG-based perceptual loss has helped to overcome this issue, significantly improving the synthesis quality, but at the same time limiting the progress of GAN models for semantic image synthesis. In this work, we propose a novel, simplified GAN model, which needs only adversarial supervision to achieve high quality results. We re-design the discriminator as a semantic segmentation network, directly using the given semantic label maps as the ground truth for training. By providing stronger supervision to the discriminator as well as to the generator through spatially- and semantically-aware discriminator feedback, we are able to synthesize images of higher fidelity with better alignment to their input label maps, making the use of the perceptual loss superfluous. Moreover, we enable high-quality multi-modal image synthesis through global and local sampling of a 3D noise tensor injected into the generator, which allows complete or partial image change. We show that images synthesized by our model are more diverse and follow the color and texture distributions of real images more closely. We achieve an average improvement of $6$ FID and $5$ mIoU points over the state of the art across different datasets using only adversarial supervision.
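
The global and local sampling of a 3D noise tensor mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the channels x height x width list layout, the channel count, and the choice to share one freshly drawn vector across the whole masked region are all assumptions made for illustration. Global sampling draws one noise vector and replicates it at every pixel, so resampling it changes the whole image; local sampling redraws the noise only where a mask is set, so only that region would change.

```python
import random


def sample_global_noise(channels, height, width, rng):
    """Draw one noise vector and replicate it at every pixel.

    Returns a nested list indexed as noise[channel][y][x] (assumed layout).
    """
    z = [rng.gauss(0.0, 1.0) for _ in range(channels)]
    return [[[z[c] for _ in range(width)] for _ in range(height)]
            for c in range(channels)]


def resample_local_noise(noise, mask, rng):
    """Return a copy of `noise` with a new vector written into masked pixels.

    Pixels where mask[y][x] is truthy receive the same freshly drawn vector
    (an assumption of this sketch); all other pixels keep their old noise.
    """
    channels = len(noise)
    z_new = [rng.gauss(0.0, 1.0) for _ in range(channels)]
    return [
        [
            [z_new[c] if mask[y][x] else noise[c][y][x]
             for x in range(len(row))]
            for y, row in enumerate(noise[c])
        ]
        for c in range(channels)
    ]


# Hypothetical toy sizes: 4 noise channels on a 3x3 label map.
rng = random.Random(0)
noise = sample_global_noise(channels=4, height=3, width=3, rng=rng)

# Hypothetical mask selecting the bottom-right 2x2 region, e.g. one object.
mask = [[0, 0, 0],
        [0, 1, 1],
        [0, 1, 1]]
edited = resample_local_noise(noise, mask, rng)
```

In a full model, `noise` would be fed to the generator together with the label map; here the tensors are only constructed so that the difference between a complete and a partial resample is visible.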

Results

Task                        Dataset                          Metric  Value  Model
Image-to-Image Translation  COCO-Stuff Labels-to-Photos      FID     17     OASIS
Image-to-Image Translation  COCO-Stuff Labels-to-Photos      mIoU    44.1   OASIS
Image-to-Image Translation  Cityscapes Labels-to-Photo       FID     47.7   OASIS
Image-to-Image Translation  Cityscapes Labels-to-Photo       LPIPS   0.275  OASIS
Image-to-Image Translation  Cityscapes Labels-to-Photo       mIoU    69.3   OASIS
Image-to-Image Translation  ADE20K Labels-to-Photos          FID     28.3   OASIS
Image-to-Image Translation  ADE20K Labels-to-Photos          LPIPS   0.265  OASIS
Image-to-Image Translation  ADE20K Labels-to-Photos          mIoU    48.8   OASIS
Image-to-Image Translation  ADE20K-Outdoor Labels-to-Photos  FID     48.6   OASIS
Image-to-Image Translation  ADE20K-Outdoor Labels-to-Photos  mIoU    40.4   OASIS
Image Generation            COCO-Stuff Labels-to-Photos      FID     17     OASIS
Image Generation            COCO-Stuff Labels-to-Photos      mIoU    44.1   OASIS
Image Generation            Cityscapes Labels-to-Photo       FID     47.7   OASIS
Image Generation            Cityscapes Labels-to-Photo       LPIPS   0.275  OASIS
Image Generation            Cityscapes Labels-to-Photo       mIoU    69.3   OASIS
Image Generation            ADE20K Labels-to-Photos          FID     28.3   OASIS
Image Generation            ADE20K Labels-to-Photos          LPIPS   0.265  OASIS
Image Generation            ADE20K Labels-to-Photos          mIoU    48.8   OASIS
Image Generation            ADE20K-Outdoor Labels-to-Photos  FID     48.6   OASIS
Image Generation            ADE20K-Outdoor Labels-to-Photos  mIoU    40.4   OASIS
1 Image, 2*2 Stitching      COCO-Stuff Labels-to-Photos      FID     17     OASIS
1 Image, 2*2 Stitching      COCO-Stuff Labels-to-Photos      mIoU    44.1   OASIS
1 Image, 2*2 Stitching      Cityscapes Labels-to-Photo       FID     47.7   OASIS
1 Image, 2*2 Stitching      Cityscapes Labels-to-Photo       LPIPS   0.275  OASIS
1 Image, 2*2 Stitching      Cityscapes Labels-to-Photo       mIoU    69.3   OASIS
1 Image, 2*2 Stitching      ADE20K Labels-to-Photos          FID     28.3   OASIS
1 Image, 2*2 Stitching      ADE20K Labels-to-Photos          LPIPS   0.265  OASIS
1 Image, 2*2 Stitching      ADE20K Labels-to-Photos          mIoU    48.8   OASIS
1 Image, 2*2 Stitching      ADE20K-Outdoor Labels-to-Photos  FID     48.6   OASIS
1 Image, 2*2 Stitching      ADE20K-Outdoor Labels-to-Photos  mIoU    40.4   OASIS

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection (2025-07-17)
FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization (2025-07-17)