Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models

Dongjun Kim, Yeongmin Kim, Se Jung Kwon, Wanmo Kang, Il-Chul Moon

2022-11-28

Denoising, Image Generation, Conditional Image Generation

Abstract

The proposed method, Discriminator Guidance, aims to improve sample generation of pre-trained diffusion models. The approach introduces a discriminator that gives explicit supervision on whether a denoising sample path is realistic. Unlike GANs, our approach does not require joint training of score and discriminator networks. Instead, we train the discriminator after score training, making discriminator training stable and fast to converge. In sample generation, we add an auxiliary term to the pre-trained score to deceive the discriminator. This term corrects the model score to the data score at the optimal discriminator, which implies that the discriminator helps better score estimation in a complementary way. Using our algorithm, we achieve state-of-the-art results on ImageNet 256x256 with FID 1.83 and recall 0.64, similar to the validation data's FID (1.68) and recall (0.66). We release the code at https://github.com/alsdudrla10/DG.
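The claim that the auxiliary term corrects the model score to the data score at the optimal discriminator can be checked on a toy problem. The sketch below is not the authors' implementation. It assumes a single noise level, a 1D Gaussian "data" density p and a deliberately mismatched Gaussian "model" density q, and it uses the closed-form optimal discriminator d* = p / (p + q) in place of a trained network. All function names, the Gaussian parameters, and the `weight` argument are hypothetical choices made for illustration. Under these assumptions, log(d* / (1 - d*)) = log p - log q, so adding its gradient to the model score should recover the data score.

```python
import numpy as np

# Hypothetical toy setup (not from the paper): 1D Gaussian densities at one noise level.
P_MEAN, P_STD = 0.0, 1.0   # "data" density p
Q_MEAN, Q_STD = 0.5, 1.2   # mismatched "model" density q


def log_normal_pdf(x, mean, std):
    """Log-density of a 1D Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))


def data_score(x):
    """Exact score of p: d/dx log p(x)."""
    return -(x - P_MEAN) / P_STD**2


def model_score(x):
    """Score of the imperfect model q: d/dx log q(x)."""
    return -(x - Q_MEAN) / Q_STD**2


def optimal_discriminator(x):
    """Closed-form optimal discriminator d*(x) = p(x) / (p(x) + q(x))."""
    log_p = log_normal_pdf(x, P_MEAN, P_STD)
    log_q = log_normal_pdf(x, Q_MEAN, Q_STD)
    return 1.0 / (1.0 + np.exp(log_q - log_p))


def discriminator_correction(x, eps=1e-4):
    """Central finite-difference gradient of log(d / (1 - d)).

    A trained discriminator network would normally be differentiated with
    autograd; finite differences keep this sketch dependency-free.
    """
    def log_odds(z):
        d = optimal_discriminator(z)
        return np.log(d) - np.log1p(-d)

    return (log_odds(x + eps) - log_odds(x - eps)) / (2.0 * eps)


def guided_score(x, weight=1.0):
    """Model score plus the weighted discriminator term."""
    return model_score(x) + weight * discriminator_correction(x)


xs = np.linspace(-2.0, 2.0, 9)
max_gap_before = np.max(np.abs(model_score(xs) - data_score(xs)))
max_gap_after = np.max(np.abs(guided_score(xs) - data_score(xs)))
print(f"max |model score - data score|:  {max_gap_before:.4f}")
print(f"max |guided score - data score|: {max_gap_after:.2e}")
```

With `weight=1.0` the remaining gap is only finite-difference rounding error. In practice the discriminator is learned and therefore only approximately optimal, so the correction would be approximate as well.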

Results

Task	Dataset	Metric	Value	Model
Image Generation	CelebA 64x64	FID	1.34	STDDPM-G++
Image Generation	CIFAR-10	FID	1.77	Discriminator Guidance (unconditional)
Image Generation	ImageNet 256x256	FID	1.83	Discriminator Guidance
Image Generation	ImageNet 256x256	FID	3.18	ADM-G++ (FID)
Image Generation	ImageNet 256x256	FID	4.45	ADM-G++ (Recall)
Image Generation	CIFAR-10	FID	1.64	EDM-G++ (conditional)
Conditional Image Generation	CIFAR-10	FID	1.64	EDM-G++ (conditional)
