Analyzing and Improving the Image Quality of StyleGAN

Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, Timo Aila

2019-12-03 | CVPR 2020 | Attribute Image Generation Conditional Image Generation

Abstract

The style-based GAN architecture (StyleGAN) yields state-of-the-art results in data-driven unconditional generative image modeling. We expose and analyze several of its characteristic artifacts, and propose changes in both model architecture and training methods to address them. In particular, we redesign the generator normalization, revisit progressive growing, and regularize the generator to encourage good conditioning in the mapping from latent codes to images. In addition to improving image quality, this path length regularizer yields the additional benefit that the generator becomes significantly easier to invert. This makes it possible to reliably attribute a generated image to a particular network. We furthermore visualize how well the generator utilizes its output resolution, and identify a capacity problem, motivating us to train larger models for additional quality improvements. Overall, our improved model redefines the state of the art in unconditional image modeling, both in terms of existing distribution quality metrics as well as perceived image quality.
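The path length regularizer named in the abstract can be sketched as a formula. The notation is an assumption, since this page defines none of it: g is the generator's synthesis network, w a latent code, J_w the Jacobian of g at w, y a random image with normally distributed pixel intensities, and a a target length tracked as a running average of the observed lengths.

```latex
% Sketch of the path length penalty, under the notation assumed above.
% J_w = \partial g(w) / \partial w is the Jacobian of the generator at w.
\mathcal{L}_{\text{pl}}
  = \mathbb{E}_{\mathbf{w},\, \mathbf{y} \sim \mathcal{N}(0, \mathbf{I})}
    \left( \left\lVert \mathbf{J}_{\mathbf{w}}^{T} \mathbf{y} \right\rVert_{2} - a \right)^{2}
```

The penalty is smallest when a step of fixed size in latent space changes the image by roughly the same magnitude regardless of position or direction, which is what "good conditioning" of the latent-to-image mapping refers to. In practice the product J_w^T y would typically be obtained by backpropagating y through the generator rather than by forming the Jacobian explicitly.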

Results

Task	Dataset	Metric	Value	Model
Image Generation	LSUN Car 256 x 256	FID	2.32	StyleGAN2
Image Generation	LSUN Cat 256 x 256	FID	6.93	StyleGAN2
Image Generation	LSUN Horse 256 x 256	FID	3.43	StyleGAN2
Image Generation	FFHQ	FID	2.84	StyleGAN2
Image Generation	FFHQ 1024 x 1024	FID	2.84	StyleGAN2
Image Generation	LSUN Car 512 x 384	FID	2.32	StyleGAN2
Image Generation	LSUN Churches 256 x 256	FID	3.86	StyleGAN2
Image Generation	ArtBench-10 (32x32)	FID	4.491	StyleGAN2
Conditional Image Generation	ArtBench-10 (32x32)	FID	4.491	StyleGAN2
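For readers unfamiliar with the metric in the table, FID (Fréchet Inception Distance) compares feature statistics of real and generated images; lower is better. The standard definition is given below. The symbols are assumptions introduced here and do not appear on this page: mu_r, Sigma_r and mu_g, Sigma_g are the mean and covariance of Inception network features for real and generated images respectively.

```latex
% Standard FID definition: Frechet distance between two Gaussians
% fitted to feature embeddings of real (r) and generated (g) images.
\mathrm{FID}
  = \left\lVert \boldsymbol{\mu}_r - \boldsymbol{\mu}_g \right\rVert_2^{2}
  + \operatorname{Tr}\!\left( \boldsymbol{\Sigma}_r + \boldsymbol{\Sigma}_g
      - 2 \left( \boldsymbol{\Sigma}_r \boldsymbol{\Sigma}_g \right)^{1/2} \right)
```

Because the value depends on the feature extractor, the number of samples, and the image resolution, scores are only comparable between rows that share a dataset and resolution.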
