Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Unified Generative Adversarial Networks for Controllable Image-to-Image Translation

Hao Tang, Hong Liu, Nicu Sebe

2019-12-12 · Facial Expression Translation · Translation · Image Generation · Image-to-Image Translation

Abstract

We propose a unified Generative Adversarial Network (GAN) for controllable image-to-image translation, i.e., transferring an image from a source to a target domain guided by controllable structures. In addition to conditioning on a reference image, we show how the model can generate images conditioned on controllable structures, e.g., class labels, object keypoints, human skeletons, and scene semantic maps. The proposed model consists of a single generator and a discriminator taking a conditional image and the target controllable structure as input. In this way, the conditional image can provide appearance information and the controllable structure can provide the structure information for generating the target result. Moreover, our model learns the image-to-image mapping through three novel losses, i.e., color loss, controllable structure guided cycle-consistency loss, and controllable structure guided self-content preserving loss. Also, we present the Fréchet ResNet Distance (FRD) to evaluate the quality of the generated images. Experiments on two challenging image translation tasks, i.e., hand gesture-to-gesture translation and cross-view image translation, show that our model generates convincing results, and significantly outperforms other state-of-the-art methods on both tasks. Meanwhile, the proposed framework is a unified solution, thus it can be applied to solving other controllable structure guided image translation tasks such as landmark guided facial expression translation and keypoint guided person image generation. To the best of our knowledge, we are the first to make one GAN framework work on all such controllable structure guided image translation tasks. Code is available at https://github.com/Ha0Tang/GestureGAN.
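The structure-guided cycle-consistency idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so the L1 form below is an assumption borrowed from the usual cycle-consistency setup: translate the image to the target structure, translate it back with the source structure, and penalise the difference from the original. The names `cycle_consistency_l1` and `toy_generator` are hypothetical, and the toy generator is only a stand-in for a trained network, not the authors' model.

```python
import numpy as np


def cycle_consistency_l1(generator, image_x, structure_x, structure_y):
    """Mean absolute error between an image and its round-trip reconstruction.

    Assumed form (not taken from the abstract):
        loss = mean(|x - G(G(x, s_y), s_x)|)
    """
    # Forward pass: source image guided by the target structure.
    fake_y = generator(image_x, structure_y)
    # Backward pass: translated image guided by the source structure.
    reconstructed_x = generator(fake_y, structure_x)
    return float(np.mean(np.abs(image_x - reconstructed_x)))


def toy_generator(image, structure):
    """Hypothetical stand-in for a trained generator: a fixed 50/50 blend."""
    return 0.5 * image + 0.5 * structure


# Tiny synthetic example: a 4x4 "image" and two structure maps.
image_x = np.ones((4, 4))
structure_x = np.ones((4, 4))
structure_y = np.zeros((4, 4))

loss = cycle_consistency_l1(toy_generator, image_x, structure_x, structure_y)
```

A single generator is reused for both directions here, which matches the abstract's description of one generator conditioned on an image and a target structure; everything else in the sketch is illustrative.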

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image-to-Image Translation | Dayton (64×64) - ground-to-aerial | LPIPS | 0.4527 | UniGAN |
| Image-to-Image Translation | CVUSA | KL | 2.6 | UniGAN |
| Image-to-Image Translation | CVUSA | PSNR | 22.8223 | UniGAN |
| Image-to-Image Translation | CVUSA | SD | 19.8276 | UniGAN |
| Image-to-Image Translation | CVUSA | SSIM | 0.5366 | UniGAN |
| Image-to-Image Translation | Dayton (64×64) - aerial-to-ground | KL | 2.16 | UniGAN |
| Image-to-Image Translation | Dayton (64×64) - aerial-to-ground | LPIPS | 0.3817 | UniGAN |
| Image-to-Image Translation | Dayton (64×64) - aerial-to-ground | PSNR | 23.3632 | UniGAN |
| Image-to-Image Translation | Dayton (64×64) - aerial-to-ground | SD | 16.4788 | UniGAN |
| Image-to-Image Translation | Dayton (64×64) - aerial-to-ground | SSIM | 0.5064 | UniGAN |
| Image-to-Image Translation | Dayton (256×256) - aerial-to-ground | KL | 5.17 | UniGAN |
| Image-to-Image Translation | Dayton (256×256) - aerial-to-ground | PSNR | 22.0273 | UniGAN |
| Image-to-Image Translation | Dayton (256×256) - aerial-to-ground | SD | 17.6542 | UniGAN |
| Image-to-Image Translation | Dayton (256×256) - aerial-to-ground | SSIM | 0.3357 | UniGAN |
| Image Generation | Dayton (64×64) - ground-to-aerial | LPIPS | 0.4527 | UniGAN |
| Image Generation | CVUSA | KL | 2.6 | UniGAN |
| Image Generation | CVUSA | PSNR | 22.8223 | UniGAN |
| Image Generation | CVUSA | SD | 19.8276 | UniGAN |
| Image Generation | CVUSA | SSIM | 0.5366 | UniGAN |
| Image Generation | Dayton (64×64) - aerial-to-ground | KL | 2.16 | UniGAN |
| Image Generation | Dayton (64×64) - aerial-to-ground | LPIPS | 0.3817 | UniGAN |
| Image Generation | Dayton (64×64) - aerial-to-ground | PSNR | 23.3632 | UniGAN |
| Image Generation | Dayton (64×64) - aerial-to-ground | SD | 16.4788 | UniGAN |
| Image Generation | Dayton (64×64) - aerial-to-ground | SSIM | 0.5064 | UniGAN |
| Image Generation | Dayton (256×256) - aerial-to-ground | KL | 5.17 | UniGAN |
| Image Generation | Dayton (256×256) - aerial-to-ground | PSNR | 22.0273 | UniGAN |
| Image Generation | Dayton (256×256) - aerial-to-ground | SD | 17.6542 | UniGAN |
| Image Generation | Dayton (256×256) - aerial-to-ground | SSIM | 0.3357 | UniGAN |
| Hand | NTU Hand Digit | AMT | 29.3 | UniGAN |
| Hand | NTU Hand Digit | FID | 6.7493 | UniGAN |
| Hand | NTU Hand Digit | FRD | 1.7401 | UniGAN |
| Hand | NTU Hand Digit | IS | 2.3783 | UniGAN |
| Hand | NTU Hand Digit | PSNR | 32.6574 | UniGAN |
| Hand | Senz3D | AMT | 27.6 | UniGAN |
| Hand | Senz3D | FID | 12.4465 | UniGAN |
| Hand | Senz3D | FRD | 2.2104 | UniGAN |
| Hand | Senz3D | IS | 2.2159 | UniGAN |
| Hand | Senz3D | PSNR | 31.542 | UniGAN |
| 1 Image, 2*2 Stitching | Dayton (64×64) - ground-to-aerial | LPIPS | 0.4527 | UniGAN |
| 1 Image, 2*2 Stitching | CVUSA | KL | 2.6 | UniGAN |
| 1 Image, 2*2 Stitching | CVUSA | PSNR | 22.8223 | UniGAN |
| 1 Image, 2*2 Stitching | CVUSA | SD | 19.8276 | UniGAN |
| 1 Image, 2*2 Stitching | CVUSA | SSIM | 0.5366 | UniGAN |
| 1 Image, 2*2 Stitching | Dayton (64×64) - aerial-to-ground | KL | 2.16 | UniGAN |
| 1 Image, 2*2 Stitching | Dayton (64×64) - aerial-to-ground | LPIPS | 0.3817 | UniGAN |
| 1 Image, 2*2 Stitching | Dayton (64×64) - aerial-to-ground | PSNR | 23.3632 | UniGAN |
| 1 Image, 2*2 Stitching | Dayton (64×64) - aerial-to-ground | SD | 16.4788 | UniGAN |
| 1 Image, 2*2 Stitching | Dayton (64×64) - aerial-to-ground | SSIM | 0.5064 | UniGAN |
| 1 Image, 2*2 Stitching | Dayton (256×256) - aerial-to-ground | KL | 5.17 | UniGAN |
| 1 Image, 2*2 Stitching | Dayton (256×256) - aerial-to-ground | PSNR | 22.0273 | UniGAN |
| 1 Image, 2*2 Stitching | Dayton (256×256) - aerial-to-ground | SD | 17.6542 | UniGAN |
| 1 Image, 2*2 Stitching | Dayton (256×256) - aerial-to-ground | SSIM | 0.3357 | UniGAN |
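For readers unfamiliar with the PSNR metric reported above, here is a minimal sketch of the standard definition, 10 · log10(MAX² / MSE). It assumes 8-bit images (peak value 255) and is not the evaluation script behind the reported numbers; the function name `psnr` and the `max_value` parameter are illustrative choices.

```python
import numpy as np


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    generated = np.asarray(generated, dtype=np.float64)
    # Mean squared error over all pixels and channels.
    mse = np.mean((reference - generated) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Tiny synthetic example: every pixel differs by exactly 1 intensity level.
reference = np.full((8, 8, 3), 100, dtype=np.uint8)
generated = np.full((8, 8, 3), 101, dtype=np.uint8)
score = psnr(reference, generated)
```

Higher PSNR means the generated image is closer to the reference in a pixel-wise sense, which is why it is usually reported alongside perceptual metrics such as SSIM and LPIPS.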
