Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


GANs N' Roses: Stable, Controllable, Diverse Image to Image Translation (works for videos too!)

Min Jin Chong, David Forsyth

2021-06-11 · multimodal generation · Translation · Image-to-Image Translation

Paper · PDF · Code (official) · Code

Abstract

We show how to learn a map that takes a content code, derived from a face image, and a randomly chosen style code to an anime image. We derive an adversarial loss from our simple and effective definitions of style and content. This adversarial loss guarantees the map is diverse -- a very wide range of anime can be produced from a single content code. Under plausible assumptions, the map is not just diverse, but also correctly represents the probability of an anime, conditioned on an input face. In contrast, current multimodal generation procedures cannot capture the complex styles that appear in anime. Extensive quantitative experiments support the idea that the map is correct. Extensive qualitative results show that the method can generate a much more diverse range of styles than SOTA comparisons. Finally, we show that our formalization of content and style allows us to perform video-to-video translation without ever training on videos.
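The core idea above, a learned map that combines one content code with many sampled style codes to produce many diverse outputs, can be illustrated with a toy sketch. This is a hypothetical stand-in (a fixed random linear generator), not the authors' architecture; the dimensions and names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

CONTENT_DIM, STYLE_DIM, OUT_DIM = 8, 4, 16

# Toy stand-in for the learned map: fixed random linear weights that
# combine a content code and a style code into an "image" vector.
W_content = rng.normal(size=(OUT_DIM, CONTENT_DIM))
W_style = rng.normal(size=(OUT_DIM, STYLE_DIM))

def generate(content_code, style_code):
    """Map (content, style) -> output. Fixing the content code and
    resampling the style code yields a family of distinct outputs,
    mirroring the one-face-many-anime property described above."""
    return W_content @ content_code + W_style @ style_code

# One content code, several randomly chosen style codes.
content = rng.normal(size=CONTENT_DIM)
outputs = [generate(content, rng.normal(size=STYLE_DIM)) for _ in range(3)]
```

In the paper the map is a trained generator and diversity is enforced by an adversarial loss; here the linearity just makes the content/style separation explicit.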

Results

| Task | Dataset | Metric | Value | Model |
|------|---------|--------|-------|-------|
| Image-to-Image Translation | cat2dog | DFID | 26.1 | GNR |
| Image-to-Image Translation | cat2dog | FID | 26.9 | GNR |
| Image-to-Image Translation | cat2dog | DFID | 53.6 | StarGANv2 |
| Image-to-Image Translation | cat2dog | FID | 44.2 | StarGANv2 |
| Image-to-Image Translation | cat2dog | DFID | 160.1 | DRIT++ |
| Image-to-Image Translation | cat2dog | FID | 91.5 | DRIT++ |
| Image-to-Image Translation | cat2dog | DFID | 172.5 | CouncilGAN |
| Image-to-Image Translation | cat2dog | FID | 90.8 | CouncilGAN |
| Image-to-Image Translation | selfie2anime | DFID | 35.6 | GNR |
| Image-to-Image Translation | selfie2anime | FID | 34.4 | GNR |
| Image-to-Image Translation | selfie2anime | LPIPS | 0.505 | GNR |
| Image-to-Image Translation | selfie2anime | DFID | 56.2 | CouncilGAN |
| Image-to-Image Translation | selfie2anime | FID | 38.1 | CouncilGAN |
| Image-to-Image Translation | selfie2anime | LPIPS | 0.43 | CouncilGAN |
| Image-to-Image Translation | selfie2anime | DFID | 83 | StarGANv2 |
| Image-to-Image Translation | selfie2anime | FID | 59.8 | StarGANv2 |
| Image-to-Image Translation | selfie2anime | LPIPS | 0.427 | StarGANv2 |
| Image-to-Image Translation | selfie2anime | DFID | 94.6 | DRIT++ |
| Image-to-Image Translation | selfie2anime | FID | 63.8 | DRIT++ |
| Image-to-Image Translation | selfie2anime | LPIPS | 0.201 | DRIT++ |
| Image Generation | cat2dog | DFID | 26.1 | GNR |
| Image Generation | cat2dog | FID | 26.9 | GNR |
| Image Generation | cat2dog | DFID | 53.6 | StarGANv2 |
| Image Generation | cat2dog | FID | 44.2 | StarGANv2 |
| Image Generation | cat2dog | DFID | 160.1 | DRIT++ |
| Image Generation | cat2dog | FID | 91.5 | DRIT++ |
| Image Generation | cat2dog | DFID | 172.5 | CouncilGAN |
| Image Generation | cat2dog | FID | 90.8 | CouncilGAN |
| Image Generation | selfie2anime | DFID | 35.6 | GNR |
| Image Generation | selfie2anime | FID | 34.4 | GNR |
| Image Generation | selfie2anime | LPIPS | 0.505 | GNR |
| Image Generation | selfie2anime | DFID | 56.2 | CouncilGAN |
| Image Generation | selfie2anime | FID | 38.1 | CouncilGAN |
| Image Generation | selfie2anime | LPIPS | 0.43 | CouncilGAN |
| Image Generation | selfie2anime | DFID | 83 | StarGANv2 |
| Image Generation | selfie2anime | FID | 59.8 | StarGANv2 |
| Image Generation | selfie2anime | LPIPS | 0.427 | StarGANv2 |
| Image Generation | selfie2anime | DFID | 94.6 | DRIT++ |
| Image Generation | selfie2anime | FID | 63.8 | DRIT++ |
| Image Generation | selfie2anime | LPIPS | 0.201 | DRIT++ |
| 1 Image, 2*2 Stitching | cat2dog | DFID | 26.1 | GNR |
| 1 Image, 2*2 Stitching | cat2dog | FID | 26.9 | GNR |
| 1 Image, 2*2 Stitching | cat2dog | DFID | 53.6 | StarGANv2 |
| 1 Image, 2*2 Stitching | cat2dog | FID | 44.2 | StarGANv2 |
| 1 Image, 2*2 Stitching | cat2dog | DFID | 160.1 | DRIT++ |
| 1 Image, 2*2 Stitching | cat2dog | FID | 91.5 | DRIT++ |
| 1 Image, 2*2 Stitching | cat2dog | DFID | 172.5 | CouncilGAN |
| 1 Image, 2*2 Stitching | cat2dog | FID | 90.8 | CouncilGAN |
| 1 Image, 2*2 Stitching | selfie2anime | DFID | 35.6 | GNR |
| 1 Image, 2*2 Stitching | selfie2anime | FID | 34.4 | GNR |
| 1 Image, 2*2 Stitching | selfie2anime | LPIPS | 0.505 | GNR |
| 1 Image, 2*2 Stitching | selfie2anime | DFID | 56.2 | CouncilGAN |
| 1 Image, 2*2 Stitching | selfie2anime | FID | 38.1 | CouncilGAN |
| 1 Image, 2*2 Stitching | selfie2anime | LPIPS | 0.43 | CouncilGAN |
| 1 Image, 2*2 Stitching | selfie2anime | DFID | 83 | StarGANv2 |
| 1 Image, 2*2 Stitching | selfie2anime | FID | 59.8 | StarGANv2 |
| 1 Image, 2*2 Stitching | selfie2anime | LPIPS | 0.427 | StarGANv2 |
| 1 Image, 2*2 Stitching | selfie2anime | DFID | 94.6 | DRIT++ |
| 1 Image, 2*2 Stitching | selfie2anime | FID | 63.8 | DRIT++ |
| 1 Image, 2*2 Stitching | selfie2anime | LPIPS | 0.201 | DRIT++ |
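The FID values in the table (lower is better) come from the standard Fréchet Inception Distance: fit a Gaussian to real and generated feature sets and compute the Fréchet distance between them. A minimal sketch of that formula follows; in practice the features are Inception-v3 activations, while random vectors stand in for them here. (DFID, the paper's diversity-aware variant, is not reproduced.)

```python
import numpy as np
from scipy.linalg import sqrtm  # matrix square root for the covariance term

def fid(feats_a, feats_b):
    """Frechet Inception Distance between two feature sets, each of
    shape (n_samples, dim):
        ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 * (S_a S_b)^(1/2))
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    s_a = np.cov(feats_a, rowvar=False)
    s_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(s_a @ s_b)
    if np.iscomplexobj(covmean):   # discard tiny numerical imaginary parts
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(s_a + s_b - 2 * covmean))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=(500, 8))   # stand-in "real" features
b = rng.normal(0.5, 1.0, size=(500, 8))   # mean-shifted "generated" features

d_same = fid(a, a)    # a distribution against itself: near zero
d_shift = fid(a, b)   # shifted distribution: clearly positive
```

The identical numbers reported under all three task headings reflect the same evaluation being indexed under multiple task pages.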

Related Papers

A Translation of Probabilistic Event Calculus into Markov Decision Processes (2025-07-17)
Function-to-Style Guidance of LLMs for Code Translation (2025-07-15)
Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation (2025-07-09)
Pun Intended: Multi-Agent Translation of Wordplay with Contrastive Learning and Phonetic-Semantic Embeddings (2025-07-09)
Unconditional Diffusion for Generative Sequential Recommendation (2025-07-08)
GRAFT: A Graph-based Flow-aware Agentic Framework for Document-level Machine Translation (2025-07-04)
TransLaw: Benchmarking Large Language Models in Multi-Agent Simulation of the Collaborative Translation (2025-07-01)
CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation (2025-06-29)