Stochastic Conditional Diffusion Models for Robust Semantic Image Synthesis

Juyeon Ko, Inho Kong, Dogyun Park, Hyunwoo J. Kim

2024-02-26

Noisy Semantic Image Synthesis, Image Generation, Conditional Image Generation, Image-to-Image Translation

Abstract

Semantic image synthesis (SIS) is the task of generating realistic images that correspond to semantic maps (labels). In real-world applications, however, SIS often encounters noisy user inputs. To address this, we propose the Stochastic Conditional Diffusion Model (SCDM), a robust conditional diffusion model with novel forward and generation processes tailored for SIS with noisy labels. It enhances robustness by stochastically perturbing the semantic label maps through Label Diffusion, which diffuses the labels with discrete diffusion. Under this label diffusion, the noisy and clean semantic maps grow more similar as the timestep increases and become identical at $t=T$. This facilitates the generation of an image close to a clean image, enabling robust generation. Furthermore, we propose a class-wise noise schedule that diffuses the labels at different rates depending on the class. Through extensive experiments and analyses on benchmark datasets, including a novel experimental setup that simulates human errors in real-world applications, we demonstrate that the proposed method generates high-quality samples. Code is available at https://github.com/mlvlab/SCDM.
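
To make the Label Diffusion idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes an absorbing-state form of discrete diffusion in which each pixel's label is replaced by an extra "mask" index with a probability that grows with the timestep, and it assumes a power-law class-wise schedule. The function names, the `gamma` exponents, and the `mask_id` index are hypothetical choices made for illustration; the abstract does not specify the transition kernel or the schedule.

```python
import numpy as np

def class_wise_keep_prob(t, T, gamma):
    """Per-class probability that a label is still unmasked at step t.

    Assumed (hypothetical) schedule: keep = 1 - (t / T) ** gamma[c].
    It equals 1 at t = 0 and 0 at t = T for every class; a smaller
    exponent makes that class's labels vanish earlier.
    """
    return 1.0 - (t / T) ** np.asarray(gamma, dtype=float)

def diffuse_labels(labels, t, T, gamma, mask_id, rng):
    """Sample a perturbed label map at step t from a clean or noisy map.

    Each pixel keeps its label with its class's keep probability and is
    otherwise replaced by the absorbing index mask_id (an assumption).
    """
    keep = class_wise_keep_prob(t, T, gamma)[labels]  # per-pixel keep prob
    u = rng.random(labels.shape)                      # uniform in [0, 1)
    return np.where(u < keep, labels, mask_id)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_classes, T = 5, 10
    mask_id = num_classes                      # extra absorbing index
    gamma = np.linspace(0.5, 2.0, num_classes) # illustrative exponents

    clean = rng.integers(0, num_classes, size=(8, 8))
    noisy = clean.copy()
    noisy[:2] = (noisy[:2] + 1) % num_classes  # simulate wrong user labels

    for t in (0, 5, 10):
        a = diffuse_labels(clean, t, T, gamma, mask_id, rng)
        b = diffuse_labels(noisy, t, T, gamma, mask_id, rng)
        print(t, float((a == b).mean()))       # fraction of agreeing pixels
```

Under these assumptions both maps collapse to the all-mask map at $t=T$, which mirrors the abstract's statement that the noisy and clean semantic maps become identical at the final timestep.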

Results

Task	Dataset	Metric	Value	Model
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	FID	15.3	SCDM
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	LPIPS	0.519	SCDM
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	mIoU	38.1	SCDM
Image-to-Image Translation	ADE20K Labels-to-Photos	FID	26.9	SCDM
Image-to-Image Translation	ADE20K Labels-to-Photos	LPIPS	0.53	SCDM
Image-to-Image Translation	ADE20K Labels-to-Photos	mIoU	49.4	SCDM
Image Generation	COCO-Stuff Labels-to-Photos	FID	15.3	SCDM
Image Generation	COCO-Stuff Labels-to-Photos	LPIPS	0.519	SCDM
Image Generation	COCO-Stuff Labels-to-Photos	mIoU	38.1	SCDM
Image Generation	ADE20K Labels-to-Photos	FID	26.9	SCDM
Image Generation	ADE20K Labels-to-Photos	LPIPS	0.53	SCDM
Image Generation	ADE20K Labels-to-Photos	mIoU	49.4	SCDM
Image Generation	CelebAMask-HQ	FID	17.4	SCDM
Image Generation	CelebAMask-HQ	LPIPS	0.418	SCDM
Image Generation	CelebAMask-HQ	mIoU	77.2	SCDM
Image Generation	noisy-ADE20K-Edge	FID	31.2	SCDM
Image Generation	noisy-ADE20K-Edge	mIoU	40.1	SCDM
Image Generation	noisy-ADE20K-DS	FID	32.4	SCDM
Image Generation	noisy-ADE20K-DS	mIoU	44.7	SCDM
Image Generation	noisy-ADE20K-Random	FID	28.1	SCDM
Image Generation	noisy-ADE20K-Random	mIoU	45.1	SCDM
Conditional Image Generation	CelebAMask-HQ	FID	17.4	SCDM
Conditional Image Generation	CelebAMask-HQ	LPIPS	0.418	SCDM
Conditional Image Generation	CelebAMask-HQ	mIoU	77.2	SCDM
Conditional Image Generation	noisy-ADE20K-Edge	FID	31.2	SCDM
Conditional Image Generation	noisy-ADE20K-Edge	mIoU	40.1	SCDM
Conditional Image Generation	noisy-ADE20K-DS	FID	32.4	SCDM
Conditional Image Generation	noisy-ADE20K-DS	mIoU	44.7	SCDM
Conditional Image Generation	noisy-ADE20K-Random	FID	28.1	SCDM
Conditional Image Generation	noisy-ADE20K-Random	mIoU	45.1	SCDM
1 Image, 2*2 Stitching	COCO-Stuff Labels-to-Photos	FID	15.3	SCDM
1 Image, 2*2 Stitching	COCO-Stuff Labels-to-Photos	LPIPS	0.519	SCDM
1 Image, 2*2 Stitching	COCO-Stuff Labels-to-Photos	mIoU	38.1	SCDM
1 Image, 2*2 Stitching	ADE20K Labels-to-Photos	FID	26.9	SCDM
1 Image, 2*2 Stitching	ADE20K Labels-to-Photos	LPIPS	0.53	SCDM
1 Image, 2*2 Stitching	ADE20K Labels-to-Photos	mIoU	49.4	SCDM
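
As a reading aid for the mIoU column, here is a minimal sketch of mean intersection-over-union computed from two integer label maps. In semantic image synthesis the prediction is commonly obtained by running a pretrained segmentation network on the generated image and comparing it against the input label map; that protocol is an assumption here, since this page does not describe it. The function name `mean_iou` is hypothetical, and it returns a fraction in [0, 1], whereas the table reports percentages.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in pred or target."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are target classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    valid = union > 0  # skip classes absent from both maps
    return float((inter[valid] / union[valid]).mean())
```

For example, with `pred = [0, 0, 1, 1]` and `target = [0, 1, 1, 1]`, the per-class IoUs are 1/2 and 2/3, so the mean is 7/12.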
