
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas

2017-10-19 | Text-to-Image Generation | Image Generation

Abstract

Although Generative Adversarial Networks (GANs) have shown remarkable success in various tasks, they still face challenges in generating high quality images. In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) aiming at generating high-resolution photo-realistic images. First, we propose a two-stage generative adversarial network architecture, StackGAN-v1, for text-to-image synthesis. The Stage-I GAN sketches the primitive shape and colors of the object based on given text description, yielding low-resolution images. The Stage-II GAN takes Stage-I results and text descriptions as inputs, and generates high-resolution images with photo-realistic details. Second, an advanced multi-stage generative adversarial network architecture, StackGAN-v2, is proposed for both conditional and unconditional generative tasks. Our StackGAN-v2 consists of multiple generators and discriminators in a tree-like structure; images at multiple scales corresponding to the same scene are generated from different branches of the tree. StackGAN-v2 shows more stable training behavior than StackGAN-v1 by jointly approximating multiple distributions. Extensive experiments demonstrate that the proposed stacked generative adversarial networks significantly outperform other state-of-the-art methods in generating photo-realistic images.
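
The tree-like structure described in the abstract can be pictured with a minimal sketch. The branch count (3), the 64-pixel base resolution, the doubling of resolution between branches, and the summed joint loss are assumptions made for illustration and are not stated on this page; `Branch`, `branch_resolutions`, and `joint_generator_loss` are hypothetical names, not part of any released StackGAN code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Branch:
    """One generator/discriminator pair working at a single image scale (hypothetical)."""
    resolution: int


def branch_resolutions(base: int = 64, num_branches: int = 3) -> List[int]:
    """Assumed layout: each deeper branch doubles the previous resolution."""
    return [base * 2 ** i for i in range(num_branches)]


def joint_generator_loss(per_scale_losses: List[float]) -> float:
    """Assumed joint objective: the per-scale generator losses are summed,
    so every scale contributes to a single training signal."""
    return sum(per_scale_losses)


# Build the (assumed) three-branch tree and combine placeholder per-scale losses.
tree = [Branch(r) for r in branch_resolutions()]
total_loss = joint_generator_loss([0.5, 0.25, 0.25])
```

Summing the per-scale terms is one simple way to express "jointly approximating multiple distributions": the low-resolution branches and the high-resolution branch are optimized together rather than in separate stages.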

Results

Task | Dataset | Metric | Value | Model
Image Generation | LSUN Bedroom 256 x 256 | FID | 35.61 | StackGAN-v2
Image Generation | COCO (Common Objects in Context) | FID | 74.05 | StackGAN-v1
Image Generation | COCO (Common Objects in Context) | Inception score | 8.45 | StackGAN-v1
Image Generation | Oxford 102 Flowers | FID | 48.68 | StackGAN-v2
Image Generation | Oxford 102 Flowers | Inception score | 3.26 | StackGAN-v2
Image Generation | Oxford 102 Flowers | FID | 55.28 | StackGAN-v1
Image Generation | Oxford 102 Flowers | Inception score | 3.2 | StackGAN-v1
Image Generation | CUB | FID | 15.3 | StackGAN-v2
Image Generation | CUB | Inception score | 3.82 | StackGAN-v2
Image Generation | CUB | FID | 51.89 | StackGAN-v1
Image Generation | CUB | Inception score | 3.7 | StackGAN-v1
Text-to-Image Generation | COCO (Common Objects in Context) | FID | 74.05 | StackGAN-v1
Text-to-Image Generation | COCO (Common Objects in Context) | Inception score | 8.45 | StackGAN-v1
Text-to-Image Generation | Oxford 102 Flowers | FID | 48.68 | StackGAN-v2
Text-to-Image Generation | Oxford 102 Flowers | Inception score | 3.26 | StackGAN-v2
Text-to-Image Generation | Oxford 102 Flowers | FID | 55.28 | StackGAN-v1
Text-to-Image Generation | Oxford 102 Flowers | Inception score | 3.2 | StackGAN-v1
Text-to-Image Generation | CUB | FID | 15.3 | StackGAN-v2
Text-to-Image Generation | CUB | Inception score | 3.82 | StackGAN-v2
Text-to-Image Generation | CUB | FID | 51.89 | StackGAN-v1
Text-to-Image Generation | CUB | Inception score | 3.7 | StackGAN-v1
10-shot image generation | COCO (Common Objects in Context) | FID | 74.05 | StackGAN-v1
10-shot image generation | COCO (Common Objects in Context) | Inception score | 8.45 | StackGAN-v1
10-shot image generation | Oxford 102 Flowers | FID | 48.68 | StackGAN-v2
10-shot image generation | Oxford 102 Flowers | Inception score | 3.26 | StackGAN-v2
10-shot image generation | Oxford 102 Flowers | FID | 55.28 | StackGAN-v1
10-shot image generation | Oxford 102 Flowers | Inception score | 3.2 | StackGAN-v1
10-shot image generation | CUB | FID | 15.3 | StackGAN-v2
10-shot image generation | CUB | Inception score | 3.82 | StackGAN-v2
10-shot image generation | CUB | FID | 51.89 | StackGAN-v1
10-shot image generation | CUB | Inception score | 3.7 | StackGAN-v1
1 Image, 2*2 Stitchi | COCO (Common Objects in Context) | FID | 74.05 | StackGAN-v1
1 Image, 2*2 Stitchi | COCO (Common Objects in Context) | Inception score | 8.45 | StackGAN-v1
1 Image, 2*2 Stitchi | Oxford 102 Flowers | FID | 48.68 | StackGAN-v2
1 Image, 2*2 Stitchi | Oxford 102 Flowers | Inception score | 3.26 | StackGAN-v2
1 Image, 2*2 Stitchi | Oxford 102 Flowers | FID | 55.28 | StackGAN-v1
1 Image, 2*2 Stitchi | Oxford 102 Flowers | Inception score | 3.2 | StackGAN-v1
1 Image, 2*2 Stitchi | CUB | FID | 15.3 | StackGAN-v2
1 Image, 2*2 Stitchi | CUB | Inception score | 3.82 | StackGAN-v2
1 Image, 2*2 Stitchi | CUB | FID | 51.89 | StackGAN-v1
1 Image, 2*2 Stitchi | CUB | Inception score | 3.7 | StackGAN-v1

Related Papers

fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Synthesizing Reality: Leveraging the Generative AI-Powered Platform Midjourney for Construction Worker Detection (2025-07-17)
FashionPose: Text to Pose to Relight Image Generation for Personalized Fashion Visualization (2025-07-17)
A Distributed Generative AI Approach for Heterogeneous Multi-Domain Environments under Data Sharing constraints (2025-07-17)
Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images (2025-07-17)
FADE: Adversarial Concept Erasure in Flow Models (2025-07-16)
CharaConsist: Fine-Grained Consistent Character Generation (2025-07-15)
CATVis: Context-Aware Thought Visualization (2025-07-15)