Image Super-Resolution via Iterative Refinement

Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David J. Fleet, Mohammad Norouzi

2021-04-15

Denoising, Super-Resolution, Image Super-Resolution, Image Generation, Conditional Image Generation

Abstract

We present SR3, an approach to image Super-Resolution via Repeated Refinement. SR3 adapts denoising diffusion probabilistic models to conditional image generation and performs super-resolution through a stochastic denoising process. Inference starts with pure Gaussian noise and iteratively refines the noisy output using a U-Net model trained on denoising at various noise levels. SR3 exhibits strong performance on super-resolution tasks at different magnification factors, on faces and natural images. We conduct human evaluation on a standard 8X face super-resolution task on CelebA-HQ, comparing with SOTA GAN methods. SR3 achieves a fool rate close to 50%, suggesting photo-realistic outputs, while GANs do not exceed a fool rate of 34%. We further show the effectiveness of SR3 in cascaded image generation, where generative models are chained with super-resolution models, yielding a competitive FID score of 11.3 on ImageNet.
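
The inference procedure described in the abstract (start from pure Gaussian noise, then refine step by step with a denoiser conditioned on the low-resolution input) can be illustrated with a minimal sketch in the style of DDPM ancestral sampling. Everything below is an assumption made for illustration: the linear noise schedule, the step count, the function names, and above all `placeholder_denoiser`, which stands in for the trained U-Net and simply treats the nearest-neighbour upsampled input as the clean image.

```python
import numpy as np


def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a 2-D array by an integer factor."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)


def placeholder_denoiser(x_up, y_t, gamma_t):
    """Hypothetical stand-in for the trained U-Net.

    Returns a noise estimate for y_t, assuming the clean image equals the
    upsampled low-resolution input. A trained model would predict the noise
    from (x_up, y_t, gamma_t) instead.
    """
    return (y_t - np.sqrt(gamma_t) * x_up) / np.sqrt(1.0 - gamma_t)


def iterative_refinement(x_low, factor=8, num_steps=100, seed=0):
    """Refine pure Gaussian noise into an output conditioned on x_low."""
    rng = np.random.default_rng(seed)

    # Assumed linear noise schedule; the abstract does not specify one.
    betas = np.linspace(1e-4, 0.02, num_steps)
    alphas = 1.0 - betas
    gammas = np.cumprod(alphas)  # cumulative signal level per step

    x_up = upsample_nearest(x_low, factor)
    y = rng.standard_normal(x_up.shape)  # inference starts from pure noise

    for t in range(num_steps - 1, -1, -1):
        eps = placeholder_denoiser(x_up, y, gammas[t])
        # Remove the predicted noise component for this step.
        mean = (y - (1.0 - alphas[t]) / np.sqrt(1.0 - gammas[t]) * eps) / np.sqrt(alphas[t])
        # Re-inject a small amount of noise on every step except the last.
        noise = rng.standard_normal(y.shape) if t > 0 else 0.0
        y = mean + np.sqrt(1.0 - alphas[t]) * noise

    return y


x_low = np.random.default_rng(1).random((16, 16))  # toy 16x16 "image"
y_high = iterative_refinement(x_low)               # 8X larger: 128x128
```

With this placeholder the loop simply converges to the upsampled input, so it only demonstrates the control flow. The stochastic, photo-realistic detail reported for SR3 would come from the learned denoiser, which this sketch does not attempt to reproduce.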

Results

Task	Dataset	Metric	Value	Model
Super-Resolution	CelebA-HQ 128x128	Consistency	2.68	SR3
Super-Resolution	CelebA-HQ 128x128	PSNR	23.04	SR3
Super-Resolution	CelebA-HQ 128x128	SSIM	0.65	SR3
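
For readers unfamiliar with the PSNR column, the following is a minimal sketch of the standard peak signal-to-noise ratio computation. It assumes images scaled to [0, 1]; the function name and the `max_value` parameter are illustrative, and the exact evaluation protocol behind the table is not stated on this page.

```python
import numpy as np


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped arrays."""
    mse = np.mean((np.asarray(reference) - np.asarray(estimate)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


reference = np.zeros((4, 4))
estimate = np.full((4, 4), 0.1)     # uniform error of 0.1, so MSE = 0.01
score = psnr(reference, estimate)   # about 20.0 dB
```

Higher PSNR means lower pixel-wise error.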
