Image-to-Image Translation with Conditional Adversarial Networks

Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros

2016-11-21, CVPR 2017
Tasks: Translation, Colorization, Nuclear Segmentation, Cross-View Image-to-Image Translation, Image-to-Image Translation

Abstract

We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
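The idea of a learned loss can be made concrete with a minimal NumPy sketch of a conditional GAN objective. The abstract does not spell out the objective, so everything here is an assumption: the L1 reconstruction term and its weight `lam=100.0` are taken from the full paper, and the function names (`bce`, `discriminator_loss`, `generator_loss`) are hypothetical. The discriminator's outputs are assumed to be probabilities that an (input, output) pair is real.

```python
import numpy as np

def bce(prob, target, eps=1e-7):
    """Mean binary cross-entropy of predicted probabilities against a constant 0/1 target."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return float(np.mean(-(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))))

def discriminator_loss(d_real, d_fake):
    """D should score real (input, target) pairs as 1 and generated pairs as 0.

    The factor 0.5 slows D relative to G (an assumption taken from the full paper).
    """
    return 0.5 * (bce(d_real, 1.0) + bce(d_fake, 0.0))

def generator_loss(d_fake, fake, target, lam=100.0):
    """Adversarial term (fool D) plus a weighted L1 term (stay close to the target).

    Uses the non-saturating form: G maximizes log D(x, G(x)).
    """
    adversarial = bce(d_fake, 1.0)
    l1 = float(np.mean(np.abs(fake - target)))
    return adversarial + lam * l1
```

The adversarial term is the learned part of the loss: it changes as the discriminator trains, so the same code applies whether the target is a photo, a map, or a colorized image.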

Results

Task	Dataset	Metric	Value	Model
Image-to-Image Translation	Aerial-to-Map	Class IOU	0.26	cGAN
Image-to-Image Translation	FLIR	PSNR	4.19	pix2pix
Image-to-Image Translation	FLIR	SSIM	0.05	pix2pix
Image-to-Image Translation	Cityscapes Labels-to-Photo	Class IOU	0.18	pix2pix
Image-to-Image Translation	Cityscapes Labels-to-Photo	Per-class Accuracy (%)	25	pix2pix
Image-to-Image Translation	Cityscapes Labels-to-Photo	Per-pixel Accuracy (%)	71	pix2pix
Image-to-Image Translation	Cityscapes Photo-to-Labels	Class IOU	0.32	pix2pix
Image-to-Image Translation	Dayton (64×64) - ground-to-aerial	SSIM	0.3675	pix2pix
Image-to-Image Translation	CVUSA	SSIM	0.3923	pix2pix
Image-to-Image Translation	Dayton (64×64) - aerial-to-ground	SSIM	0.4808	pix2pix
Image-to-Image Translation	Ego2Top	SSIM	0.2213	pix2pix
Image-to-Image Translation	Dayton (256×256) - ground-to-aerial	SSIM	0.2693	pix2pix
Image-to-Image Translation	Dayton (256×256) - aerial-to-ground	SSIM	0.418	pix2pix
Image-to-Image Translation	Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients	FID	48.6	pix2pix
Medical Image Segmentation	Cell17	Dice	0.6351	pix2pix
Medical Image Segmentation	Cell17	F1-score	0.6208	pix2pix
Medical Image Segmentation	Cell17	Hausdorff	19.1441	pix2pix
Image Generation	Aerial-to-Map	Class IOU	0.26	cGAN
Image Generation	FLIR	PSNR	4.19	pix2pix
Image Generation	FLIR	SSIM	0.05	pix2pix
Image Generation	Cityscapes Labels-to-Photo	Class IOU	0.18	pix2pix
Image Generation	Cityscapes Labels-to-Photo	Per-class Accuracy (%)	25	pix2pix
Image Generation	Cityscapes Labels-to-Photo	Per-pixel Accuracy (%)	71	pix2pix
Image Generation	Cityscapes Photo-to-Labels	Class IOU	0.32	pix2pix
Image Generation	Dayton (64×64) - ground-to-aerial	SSIM	0.3675	pix2pix
Image Generation	CVUSA	SSIM	0.3923	pix2pix
Image Generation	Dayton (64×64) - aerial-to-ground	SSIM	0.4808	pix2pix
Image Generation	Ego2Top	SSIM	0.2213	pix2pix
Image Generation	Dayton (256×256) - ground-to-aerial	SSIM	0.2693	pix2pix
Image Generation	Dayton (256×256) - aerial-to-ground	SSIM	0.418	pix2pix
Image Generation	Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients	FID	48.6	pix2pix
Image Reconstruction	Edge-to-Handbags	FID	96.31	pix2pix
Image Reconstruction	Edge-to-Handbags	LPIPS	0.234	pix2pix
Image Reconstruction	Edge-to-Shoes	FID	197.492	pix2pix
Image Reconstruction	Edge-to-Shoes	LPIPS	0.238	pix2pix
Colorization	ImageNet val	FID-5K	24.41	cGAN
1 Image, 2*2 Stitching	Aerial-to-Map	Class IOU	0.26	cGAN
1 Image, 2*2 Stitching	FLIR	PSNR	4.19	pix2pix
1 Image, 2*2 Stitching	FLIR	SSIM	0.05	pix2pix
1 Image, 2*2 Stitching	Cityscapes Labels-to-Photo	Class IOU	0.18	pix2pix
1 Image, 2*2 Stitching	Cityscapes Labels-to-Photo	Per-class Accuracy (%)	25	pix2pix
1 Image, 2*2 Stitching	Cityscapes Labels-to-Photo	Per-pixel Accuracy (%)	71	pix2pix
1 Image, 2*2 Stitching	Cityscapes Photo-to-Labels	Class IOU	0.32	pix2pix
1 Image, 2*2 Stitching	Dayton (64×64) - ground-to-aerial	SSIM	0.3675	pix2pix
1 Image, 2*2 Stitching	CVUSA	SSIM	0.3923	pix2pix
1 Image, 2*2 Stitching	Dayton (64×64) - aerial-to-ground	SSIM	0.4808	pix2pix
1 Image, 2*2 Stitching	Ego2Top	SSIM	0.2213	pix2pix
1 Image, 2*2 Stitching	Dayton (256×256) - ground-to-aerial	SSIM	0.2693	pix2pix
1 Image, 2*2 Stitching	Dayton (256×256) - aerial-to-ground	SSIM	0.418	pix2pix
1 Image, 2*2 Stitching	Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients	FID	48.6	pix2pix
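The three Cityscapes metrics in the table (per-pixel accuracy, per-class accuracy, class IOU) can all be derived from one confusion matrix. The sketch below shows only how the quantities are commonly defined; it is not the evaluation protocol that produced the numbers above, which is not described here. The function names and the toy label maps in the test are hypothetical. Scores come out as fractions in [0, 1]; multiply by 100 to compare with accuracy values reported as percentages.

```python
import numpy as np

def confusion_matrix(pred, truth, num_classes):
    """Rows index the true class, columns the predicted class."""
    idx = truth.ravel() * num_classes + pred.ravel()
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def segmentation_scores(pred, truth, num_classes):
    """Return (per-pixel accuracy, per-class accuracy, class IOU) as fractions."""
    cm = confusion_matrix(pred, truth, num_classes).astype(float)
    tp = np.diag(cm)                      # correctly labelled pixels per class
    truth_count = cm.sum(axis=1)          # pixels that truly belong to each class
    pred_count = cm.sum(axis=0)           # pixels predicted as each class
    union = truth_count + pred_count - tp
    per_pixel = tp.sum() / cm.sum()
    # Classes that never occur give 0/0 = nan and are skipped by nanmean.
    with np.errstate(divide="ignore", invalid="ignore"):
        per_class = np.nanmean(tp / truth_count)
        class_iou = np.nanmean(tp / union)
    return float(per_pixel), float(per_class), float(class_iou)
```

Class IOU is the strictest of the three because a false positive enlarges the union for the predicted class, so it is usually the lowest number in a row of results.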
