Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


GANs N' Roses: Stable, Controllable, Diverse Image to Image Translation (works for videos too!)

Min Jin Chong, David Forsyth

2021-06-11 · multimodal generation · Translation · Image-to-Image Translation

Paper · PDF · Code (official) · Code

Abstract

We show how to learn a map that takes a content code, derived from a face image, and a randomly chosen style code to an anime image. We derive an adversarial loss from our simple and effective definitions of style and content. This adversarial loss guarantees the map is diverse -- a very wide range of anime can be produced from a single content code. Under plausible assumptions, the map is not just diverse, but also correctly represents the probability of an anime, conditioned on an input face. In contrast, current multimodal generation procedures cannot capture the complex styles that appear in anime. Extensive quantitative experiments support the idea that the map is correct. Extensive qualitative results show that the method can generate a much more diverse range of styles than SOTA comparisons. Finally, we show that our formalization of content and style allows us to perform video-to-video translation without ever training on videos.
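The core idea above, a learned map that combines one content code with many sampled style codes to produce many diverse outputs, can be illustrated with a toy sketch. This is a hypothetical stand-in (a fixed random linear generator), not the authors' architecture; the dimensions and names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

CONTENT_DIM, STYLE_DIM, OUT_DIM = 8, 4, 16

# Toy stand-in for the learned map: fixed random linear weights that
# combine a content code and a style code into an "image" vector.
W_content = rng.normal(size=(OUT_DIM, CONTENT_DIM))
W_style = rng.normal(size=(OUT_DIM, STYLE_DIM))

def generate(content_code, style_code):
    """Map (content, style) -> output. Fixing the content code and
    resampling the style code yields a family of distinct outputs,
    mirroring the one-face-many-anime property described above."""
    return W_content @ content_code + W_style @ style_code

# One content code, several randomly chosen style codes.
content = rng.normal(size=CONTENT_DIM)
outputs = [generate(content, rng.normal(size=STYLE_DIM)) for _ in range(3)]
```

In the paper the map is a trained generator and diversity is enforced by an adversarial loss; here the linearity just makes the content/style separation explicit.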

Results

| Task | Dataset | Metric | Value | Model |
|------|---------|--------|-------|-------|
| Image-to-Image Translation | cat2dog | DFID | 26.1 | GNR |
| Image-to-Image Translation | cat2dog | FID | 26.9 | GNR |
| Image-to-Image Translation | cat2dog | DFID | 53.6 | StarGANv2 |
| Image-to-Image Translation | cat2dog | FID | 44.2 | StarGANv2 |
| Image-to-Image Translation | cat2dog | DFID | 160.1 | DRIT++ |
| Image-to-Image Translation | cat2dog | FID | 91.5 | DRIT++ |
| Image-to-Image Translation | cat2dog | DFID | 172.5 | CouncilGAN |
| Image-to-Image Translation | cat2dog | FID | 90.8 | CouncilGAN |
| Image-to-Image Translation | selfie2anime | DFID | 35.6 | GNR |
| Image-to-Image Translation | selfie2anime | FID | 34.4 | GNR |
| Image-to-Image Translation | selfie2anime | LPIPS | 0.505 | GNR |
| Image-to-Image Translation | selfie2anime | DFID | 56.2 | CouncilGAN |
| Image-to-Image Translation | selfie2anime | FID | 38.1 | CouncilGAN |
| Image-to-Image Translation | selfie2anime | LPIPS | 0.43 | CouncilGAN |
| Image-to-Image Translation | selfie2anime | DFID | 83 | StarGANv2 |
| Image-to-Image Translation | selfie2anime | FID | 59.8 | StarGANv2 |
| Image-to-Image Translation | selfie2anime | LPIPS | 0.427 | StarGANv2 |
| Image-to-Image Translation | selfie2anime | DFID | 94.6 | DRIT++ |
| Image-to-Image Translation | selfie2anime | FID | 63.8 | DRIT++ |
| Image-to-Image Translation | selfie2anime | LPIPS | 0.201 | DRIT++ |
| Image Generation | cat2dog | DFID | 26.1 | GNR |
| Image Generation | cat2dog | FID | 26.9 | GNR |
| Image Generation | cat2dog | DFID | 53.6 | StarGANv2 |
| Image Generation | cat2dog | FID | 44.2 | StarGANv2 |
| Image Generation | cat2dog | DFID | 160.1 | DRIT++ |
| Image Generation | cat2dog | FID | 91.5 | DRIT++ |
| Image Generation | cat2dog | DFID | 172.5 | CouncilGAN |
| Image Generation | cat2dog | FID | 90.8 | CouncilGAN |
| Image Generation | selfie2anime | DFID | 35.6 | GNR |
| Image Generation | selfie2anime | FID | 34.4 | GNR |
| Image Generation | selfie2anime | LPIPS | 0.505 | GNR |
| Image Generation | selfie2anime | DFID | 56.2 | CouncilGAN |
| Image Generation | selfie2anime | FID | 38.1 | CouncilGAN |
| Image Generation | selfie2anime | LPIPS | 0.43 | CouncilGAN |
| Image Generation | selfie2anime | DFID | 83 | StarGANv2 |
| Image Generation | selfie2anime | FID | 59.8 | StarGANv2 |
| Image Generation | selfie2anime | LPIPS | 0.427 | StarGANv2 |
| Image Generation | selfie2anime | DFID | 94.6 | DRIT++ |
| Image Generation | selfie2anime | FID | 63.8 | DRIT++ |
| Image Generation | selfie2anime | LPIPS | 0.201 | DRIT++ |
| 1 Image, 2*2 Stitching | cat2dog | DFID | 26.1 | GNR |
| 1 Image, 2*2 Stitching | cat2dog | FID | 26.9 | GNR |
| 1 Image, 2*2 Stitching | cat2dog | DFID | 53.6 | StarGANv2 |
| 1 Image, 2*2 Stitching | cat2dog | FID | 44.2 | StarGANv2 |
| 1 Image, 2*2 Stitching | cat2dog | DFID | 160.1 | DRIT++ |
| 1 Image, 2*2 Stitching | cat2dog | FID | 91.5 | DRIT++ |
| 1 Image, 2*2 Stitching | cat2dog | DFID | 172.5 | CouncilGAN |
| 1 Image, 2*2 Stitching | cat2dog | FID | 90.8 | CouncilGAN |
| 1 Image, 2*2 Stitching | selfie2anime | DFID | 35.6 | GNR |
| 1 Image, 2*2 Stitching | selfie2anime | FID | 34.4 | GNR |
| 1 Image, 2*2 Stitching | selfie2anime | LPIPS | 0.505 | GNR |
| 1 Image, 2*2 Stitching | selfie2anime | DFID | 56.2 | CouncilGAN |
| 1 Image, 2*2 Stitching | selfie2anime | FID | 38.1 | CouncilGAN |
| 1 Image, 2*2 Stitching | selfie2anime | LPIPS | 0.43 | CouncilGAN |
| 1 Image, 2*2 Stitching | selfie2anime | DFID | 83 | StarGANv2 |
| 1 Image, 2*2 Stitching | selfie2anime | FID | 59.8 | StarGANv2 |
| 1 Image, 2*2 Stitching | selfie2anime | LPIPS | 0.427 | StarGANv2 |
| 1 Image, 2*2 Stitching | selfie2anime | DFID | 94.6 | DRIT++ |
| 1 Image, 2*2 Stitching | selfie2anime | FID | 63.8 | DRIT++ |
| 1 Image, 2*2 Stitching | selfie2anime | LPIPS | 0.201 | DRIT++ |
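The FID values in the table (lower is better) come from the standard Fréchet Inception Distance: fit a Gaussian to real and generated feature sets and compute the Fréchet distance between them. A minimal sketch of that formula follows; in practice the features are Inception-v3 activations, while random vectors stand in for them here. (DFID, the paper's diversity-aware variant, is not reproduced.)

```python
import numpy as np
from scipy.linalg import sqrtm  # matrix square root for the covariance term

def fid(feats_a, feats_b):
    """Frechet Inception Distance between two feature sets, each of
    shape (n_samples, dim):
        ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 * (S_a S_b)^(1/2))
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    s_a = np.cov(feats_a, rowvar=False)
    s_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(s_a @ s_b)
    if np.iscomplexobj(covmean):   # discard tiny numerical imaginary parts
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(s_a + s_b - 2 * covmean))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=(500, 8))   # stand-in "real" features
b = rng.normal(0.5, 1.0, size=(500, 8))   # mean-shifted "generated" features

d_same = fid(a, a)    # a distribution against itself: near zero
d_shift = fid(a, b)   # shifted distribution: clearly positive
```

The identical numbers reported under all three task headings reflect the same evaluation being indexed under multiple task pages.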

Related Papers

A Translation of Probabilistic Event Calculus into Markov Decision Processes (2025-07-17)
Function-to-Style Guidance of LLMs for Code Translation (2025-07-15)
Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation (2025-07-09)
Pun Intended: Multi-Agent Translation of Wordplay with Contrastive Learning and Phonetic-Semantic Embeddings (2025-07-09)
Unconditional Diffusion for Generative Sequential Recommendation (2025-07-08)
GRAFT: A Graph-based Flow-aware Agentic Framework for Document-level Machine Translation (2025-07-04)
TransLaw: Benchmarking Large Language Models in Multi-Agent Simulation of the Collaborative Translation (2025-07-01)
CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation (2025-06-29)