
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Stochastic Conditional Diffusion Models for Robust Semantic Image Synthesis

Juyeon Ko, Inho Kong, Dogyun Park, Hyunwoo J. Kim

2024-02-26
Tasks: Noisy Semantic Image Synthesis, Image Generation, Conditional Image Generation, Image-to-Image Translation

Abstract

Semantic image synthesis (SIS) is the task of generating realistic images that correspond to semantic maps (labels). In real-world applications, however, SIS often encounters noisy user inputs. To address this, we propose the Stochastic Conditional Diffusion Model (SCDM), a robust conditional diffusion model with novel forward and generation processes tailored for SIS with noisy labels. It enhances robustness by stochastically perturbing the semantic label maps through Label Diffusion, which diffuses the labels with discrete diffusion. As the labels diffuse, the noisy and clean semantic maps become more similar with each timestep and eventually become identical at $t=T$. This facilitates the generation of an image close to a clean image, enabling robust generation. Furthermore, we propose a class-wise noise schedule that diffuses the labels at different rates depending on the class. Through extensive experiments and analyses on benchmark datasets, including a novel experimental setup that simulates human errors in real-world applications, we demonstrate that the proposed method generates high-quality samples. Code is available at https://github.com/mlvlab/SCDM.
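The claim that noisy and clean label maps converge under Label Diffusion can be illustrated with a small toy example. The sketch below is not the authors' implementation: it assumes an absorbing-state form of discrete diffusion in which each pixel's label is independently replaced by a hypothetical `MASK` label, and it uses a made-up power-law schedule with a per-class exponent `class_gamma` to stand in for the class-wise noise schedule. All function names, the schedule, and the example maps are illustrative assumptions.

```python
import random

MASK = -1  # hypothetical absorbing "masked" label, not a real class id


def keep_prob(t, T, gamma=1.0):
    """Probability that a label is still unmasked at step t (illustrative schedule)."""
    return 1.0 - (t / T) ** gamma


def diffuse_labels(label_map, t, T, class_gamma=None, rng=None):
    """Sample a perturbed label map at step t by independently masking pixels.

    class_gamma maps a class id to its schedule exponent (assumed form of a
    class-wise schedule); classes not listed use gamma = 1.0.
    """
    if rng is None:
        rng = random.Random()
    if class_gamma is None:
        class_gamma = {}
    out = []
    for row in label_map:
        new_row = []
        for c in row:
            p = keep_prob(t, T, class_gamma.get(c, 1.0))
            # Keep the original label with probability p, otherwise mask it.
            new_row.append(c if rng.random() < p else MASK)
        out.append(new_row)
    return out


def agreement(a, b):
    """Fraction of pixels at which two label maps agree."""
    pairs = [(x, y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(x == y for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    clean = [[0, 0, 1], [0, 1, 1], [2, 2, 1]]
    noisy = [[0, 1, 1], [0, 1, 1], [2, 0, 1]]  # two mislabelled pixels
    T = 10
    rng = random.Random(0)
    for t in (0, 5, 10):
        a = diffuse_labels(clean, t, T, class_gamma={2: 2.0}, rng=rng)
        b = diffuse_labels(noisy, t, T, class_gamma={2: 2.0}, rng=rng)
        print(t, round(agreement(a, b), 2))
```

Under these assumptions, the keep probability is exactly 0 at t = T, so both maps collapse to the all-`MASK` map and their agreement is 1.0 by construction, while at t = 0 both maps are returned unchanged. A class with a larger exponent keeps its label longer at intermediate steps, which is one simple way to diffuse classes at different rates.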

Results

Task | Dataset | Metric | Value | Model
Image-to-Image Translation | COCO-Stuff Labels-to-Photos | FID | 15.3 | SCDM
Image-to-Image Translation | COCO-Stuff Labels-to-Photos | LPIPS | 0.519 | SCDM
Image-to-Image Translation | COCO-Stuff Labels-to-Photos | mIoU | 38.1 | SCDM
Image-to-Image Translation | ADE20K Labels-to-Photos | FID | 26.9 | SCDM
Image-to-Image Translation | ADE20K Labels-to-Photos | LPIPS | 0.53 | SCDM
Image-to-Image Translation | ADE20K Labels-to-Photos | mIoU | 49.4 | SCDM
Image Generation | COCO-Stuff Labels-to-Photos | FID | 15.3 | SCDM
Image Generation | COCO-Stuff Labels-to-Photos | LPIPS | 0.519 | SCDM
Image Generation | COCO-Stuff Labels-to-Photos | mIoU | 38.1 | SCDM
Image Generation | ADE20K Labels-to-Photos | FID | 26.9 | SCDM
Image Generation | ADE20K Labels-to-Photos | LPIPS | 0.53 | SCDM
Image Generation | ADE20K Labels-to-Photos | mIoU | 49.4 | SCDM
Image Generation | CelebAMask-HQ | FID | 17.4 | SCDM
Image Generation | CelebAMask-HQ | LPIPS | 0.418 | SCDM
Image Generation | CelebAMask-HQ | mIoU | 77.2 | SCDM
Image Generation | noisy-ADE20K-Edge | FID | 31.2 | SCDM
Image Generation | noisy-ADE20K-Edge | mIoU | 40.1 | SCDM
Image Generation | noisy-ADE20K-DS | FID | 32.4 | SCDM
Image Generation | noisy-ADE20K-DS | mIoU | 44.7 | SCDM
Image Generation | noisy-ADE20K-Random | FID | 28.1 | SCDM
Image Generation | noisy-ADE20K-Random | mIoU | 45.1 | SCDM
Conditional Image Generation | CelebAMask-HQ | FID | 17.4 | SCDM
Conditional Image Generation | CelebAMask-HQ | LPIPS | 0.418 | SCDM
Conditional Image Generation | CelebAMask-HQ | mIoU | 77.2 | SCDM
Conditional Image Generation | noisy-ADE20K-Edge | FID | 31.2 | SCDM
Conditional Image Generation | noisy-ADE20K-Edge | mIoU | 40.1 | SCDM
Conditional Image Generation | noisy-ADE20K-DS | FID | 32.4 | SCDM
Conditional Image Generation | noisy-ADE20K-DS | mIoU | 44.7 | SCDM
Conditional Image Generation | noisy-ADE20K-Random | FID | 28.1 | SCDM
Conditional Image Generation | noisy-ADE20K-Random | mIoU | 45.1 | SCDM
1 Image, 2*2 Stitching | COCO-Stuff Labels-to-Photos | FID | 15.3 | SCDM
1 Image, 2*2 Stitching | COCO-Stuff Labels-to-Photos | LPIPS | 0.519 | SCDM
1 Image, 2*2 Stitching | COCO-Stuff Labels-to-Photos | mIoU | 38.1 | SCDM
1 Image, 2*2 Stitching | ADE20K Labels-to-Photos | FID | 26.9 | SCDM
1 Image, 2*2 Stitching | ADE20K Labels-to-Photos | LPIPS | 0.53 | SCDM
1 Image, 2*2 Stitching | ADE20K Labels-to-Photos | mIoU | 49.4 | SCDM

Related Papers

fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection (2025-07-17)
FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization (2025-07-17)
A Distributed Generative AI Approach for Heterogeneous Multi-Domain Environments under Data Sharing constraints (2025-07-17)
Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)
FADE: Adversarial Concept Erasure in Flow Models (2025-07-16)
CharaConsist: Fine-Grained Consistent Character Generation (2025-07-15)
CATVis: Context-Aware Thought Visualization (2025-07-15)