High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro

2017-11-30 | CVPR 2018

Tasks: Vocal Bursts Intensity Prediction, Sketch-to-Image Translation, Semantic Segmentation, Instance Segmentation, Image Generation, Conditional Image Generation, Image-to-Image Translation

Abstract

We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs). Conditional GANs have enabled a variety of applications, but the results are often limited to low-resolution and still far from realistic. In this work, we generate 2048x1024 visually appealing results with a novel adversarial loss, as well as new multi-scale generator and discriminator architectures. Furthermore, we extend our framework to interactive visual manipulation with two additional features. First, we incorporate object instance segmentation information, which enables object manipulations such as removing/adding objects and changing the object category. Second, we propose a method to generate diverse results given the same input, allowing users to edit the object appearance interactively. Human opinion studies demonstrate that our method significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
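The abstract mentions multi-scale discriminator architectures without giving details. One plausible reading is that several discriminators each judge the same image at a different resolution, so that coarse scales check global layout and fine scales check texture. The sketch below only builds such an image pyramid; the number of scales (three), the 2x2 average pooling, and the function names `downsample_2x` and `image_pyramid` are assumptions made for illustration and are not taken from this page.

```python
def downsample_2x(image):
    """Halve height and width by averaging each non-overlapping 2x2 block.

    `image` is a list of rows of numbers (a single-channel image).
    A trailing odd row or column is dropped.
    """
    h, w = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, w - 1, 2)
        ]
        for y in range(0, h - 1, 2)
    ]


def image_pyramid(image, num_scales=3):
    """Return [full-res, 1/2-res, 1/4-res, ...] views of one image.

    In a multi-scale setup, each level would be fed to its own
    discriminator (assumption: one discriminator per level).
    """
    levels = [image]
    for _ in range(num_scales - 1):
        levels.append(downsample_2x(levels[-1]))
    return levels


if __name__ == "__main__":
    # Tiny 4x4 stand-in for a real image.
    img = [[float(4 * r + c) for c in range(4)] for r in range(4)]
    for level in image_pyramid(img):
        print(len(level[0]), "x", len(level), level)
```

Under these assumptions, a 2048x1024 input would be seen at 2048x1024, 1024x512, and 512x256.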

Results

Task	Dataset	Metric	Value	Model
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	FID	111.5	pix2pixHD
Image-to-Image Translation	COCO-Stuff Labels-to-Photos	mIoU	14.6	pix2pixHD
Image-to-Image Translation	Cityscapes Labels-to-Photo	FID	95	pix2pixHD
Image-to-Image Translation	Cityscapes Labels-to-Photo	mIoU	58.3	pix2pixHD
Image-to-Image Translation	ADE20K Labels-to-Photos	FID	81.8	pix2pixHD
Image-to-Image Translation	ADE20K Labels-to-Photos	mIoU	20.3	pix2pixHD
Image-to-Image Translation	ADE20K-Outdoor Labels-to-Photos	FID	97.8	pix2pixHD
Image-to-Image Translation	ADE20K-Outdoor Labels-to-Photos	mIoU	17.4	pix2pixHD
Image-to-Image Translation	Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients	FID	42.8	pix2pixHD
Image-to-Image Translation	Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients	Kernel Inception Distance	0.00258	pix2pixHD
Image Generation	COCO-Stuff Labels-to-Photos	FID	111.5	pix2pixHD
Image Generation	COCO-Stuff Labels-to-Photos	mIoU	14.6	pix2pixHD
Image Generation	Cityscapes Labels-to-Photo	FID	95	pix2pixHD
Image Generation	Cityscapes Labels-to-Photo	mIoU	58.3	pix2pixHD
Image Generation	ADE20K Labels-to-Photos	FID	81.8	pix2pixHD
Image Generation	ADE20K Labels-to-Photos	mIoU	20.3	pix2pixHD
Image Generation	ADE20K-Outdoor Labels-to-Photos	FID	97.8	pix2pixHD
Image Generation	ADE20K-Outdoor Labels-to-Photos	mIoU	17.4	pix2pixHD
Image Generation	Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients	FID	42.8	pix2pixHD
Image Generation	Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients	Kernel Inception Distance	0.00258	pix2pixHD
Sketch-to-Image Translation	COCO-Stuff	FID	38.7	pix2pixHD
Sketch-to-Image Translation	COCO-Stuff	FID-C	27.1	pix2pixHD
1 Image, 2*2 Stitching	COCO-Stuff Labels-to-Photos	FID	111.5	pix2pixHD
1 Image, 2*2 Stitching	COCO-Stuff Labels-to-Photos	mIoU	14.6	pix2pixHD
1 Image, 2*2 Stitching	Cityscapes Labels-to-Photo	FID	95	pix2pixHD
1 Image, 2*2 Stitching	Cityscapes Labels-to-Photo	mIoU	58.3	pix2pixHD
1 Image, 2*2 Stitching	ADE20K Labels-to-Photos	FID	81.8	pix2pixHD
1 Image, 2*2 Stitching	ADE20K Labels-to-Photos	mIoU	20.3	pix2pixHD
1 Image, 2*2 Stitching	ADE20K-Outdoor Labels-to-Photos	FID	97.8	pix2pixHD
1 Image, 2*2 Stitching	ADE20K-Outdoor Labels-to-Photos	mIoU	17.4	pix2pixHD
1 Image, 2*2 Stitching	Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients	FID	42.8	pix2pixHD
1 Image, 2*2 Stitching	Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients	Kernel Inception Distance	0.00258	pix2pixHD
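
For readers unfamiliar with the mIoU column, the following is a minimal sketch of mean intersection-over-union between a predicted and a reference label map. It assumes flat sequences of integer class ids and averages only over classes that occur in either map. The function name `mean_iou` is hypothetical, and the exact evaluation protocol behind the table values (for example, which segmentation network labels the generated images) is not stated on this page. The table values appear to be on a 0 to 100 scale, whereas this sketch returns a fraction in [0, 1].

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union for two flat label sequences.

    pred, target: sequences of integer class ids in [0, num_classes).
    Classes absent from both sequences are excluded from the mean.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel counts once toward both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # Mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1

    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    target = [0, 0, 1, 1, 2, 2]
    pred = [0, 1, 1, 1, 2, 0]
    # Per-class IoU: 1/3, 2/3, 1/2
    print(mean_iou(pred, target, num_classes=3))
```

Multiplying the result by 100 gives a value comparable in scale to the mIoU entries above.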
