Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro
We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs). Conditional GANs have enabled a variety of applications, but the results are often limited to low resolution and are still far from realistic. In this work, we generate visually appealing 2048×1024 results with a novel adversarial loss, as well as new multi-scale generator and discriminator architectures. Furthermore, we extend our framework to interactive visual manipulation with two additional features. First, we incorporate object instance segmentation information, which enables object manipulations such as removing or adding objects and changing the object category. Second, we propose a method to generate diverse results given the same input, allowing users to edit the object appearance interactively. Human opinion studies demonstrate that our method significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
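
The abstract mentions incorporating object instance segmentation information but does not say how it is encoded. One plausible encoding, shown here purely as an illustrative assumption, is a binary boundary map that marks pixels whose instance ID differs from any of their four neighbours; such a map could then be concatenated with the semantic label map as extra generator input. The function name `instance_boundary_map` and the nested-list image representation are hypothetical choices for this sketch.

```python
from typing import List


def instance_boundary_map(instance_ids: List[List[int]]) -> List[List[int]]:
    """Return a binary map with 1 where a pixel's instance ID differs
    from at least one of its 4-connected neighbours, and 0 elsewhere.

    Illustrative sketch only; the encoding is an assumption, not taken
    from the abstract.
    """
    height = len(instance_ids)
    width = len(instance_ids[0]) if height else 0
    boundary = [[0] * width for _ in range(height)]

    for y in range(height):
        for x in range(width):
            current = instance_ids[y][x]
            # Check the up, down, left and right neighbours that lie in bounds.
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width:
                    if instance_ids[ny][nx] != current:
                        boundary[y][x] = 1
                        break
    return boundary
```

Two adjacent objects of the same semantic class share a label in the semantic map, so a map like this is one way to keep them distinguishable.
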
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image-to-Image Translation | COCO-Stuff Labels-to-Photos | FID | 111.5 | pix2pixHD |
| Image-to-Image Translation | COCO-Stuff Labels-to-Photos | mIoU | 14.6 | pix2pixHD |
| Image-to-Image Translation | Cityscapes Labels-to-Photo | FID | 95 | pix2pixHD |
| Image-to-Image Translation | Cityscapes Labels-to-Photo | mIoU | 58.3 | pix2pixHD |
| Image-to-Image Translation | ADE20K Labels-to-Photos | FID | 81.8 | pix2pixHD |
| Image-to-Image Translation | ADE20K Labels-to-Photos | mIoU | 20.3 | pix2pixHD |
| Image-to-Image Translation | ADE20K-Outdoor Labels-to-Photos | FID | 97.8 | pix2pixHD |
| Image-to-Image Translation | ADE20K-Outdoor Labels-to-Photos | mIoU | 17.4 | pix2pixHD |
| Image-to-Image Translation | Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients | FID | 42.8 | pix2pixHD |
| Image-to-Image Translation | Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients | Kernel Inception Distance | 0.00258 | pix2pixHD |
| Image Generation | COCO-Stuff Labels-to-Photos | FID | 111.5 | pix2pixHD |
| Image Generation | COCO-Stuff Labels-to-Photos | mIoU | 14.6 | pix2pixHD |
| Image Generation | Cityscapes Labels-to-Photo | FID | 95 | pix2pixHD |
| Image Generation | Cityscapes Labels-to-Photo | mIoU | 58.3 | pix2pixHD |
| Image Generation | ADE20K Labels-to-Photos | FID | 81.8 | pix2pixHD |
| Image Generation | ADE20K Labels-to-Photos | mIoU | 20.3 | pix2pixHD |
| Image Generation | ADE20K-Outdoor Labels-to-Photos | FID | 97.8 | pix2pixHD |
| Image Generation | ADE20K-Outdoor Labels-to-Photos | mIoU | 17.4 | pix2pixHD |
| Image Generation | Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients | FID | 42.8 | pix2pixHD |
| Image Generation | Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients | Kernel Inception Distance | 0.00258 | pix2pixHD |
| Sketch-to-Image Translation | COCO-Stuff | FID | 38.7 | pix2pixHD |
| Sketch-to-Image Translation | COCO-Stuff | FID-C | 27.1 | pix2pixHD |
| 1 Image, 2*2 Stitching | COCO-Stuff Labels-to-Photos | FID | 111.5 | pix2pixHD |
| 1 Image, 2*2 Stitching | COCO-Stuff Labels-to-Photos | mIoU | 14.6 | pix2pixHD |
| 1 Image, 2*2 Stitching | Cityscapes Labels-to-Photo | FID | 95 | pix2pixHD |
| 1 Image, 2*2 Stitching | Cityscapes Labels-to-Photo | mIoU | 58.3 | pix2pixHD |
| 1 Image, 2*2 Stitching | ADE20K Labels-to-Photos | FID | 81.8 | pix2pixHD |
| 1 Image, 2*2 Stitching | ADE20K Labels-to-Photos | mIoU | 20.3 | pix2pixHD |
| 1 Image, 2*2 Stitching | ADE20K-Outdoor Labels-to-Photos | FID | 97.8 | pix2pixHD |
| 1 Image, 2*2 Stitching | ADE20K-Outdoor Labels-to-Photos | mIoU | 17.4 | pix2pixHD |
| 1 Image, 2*2 Stitching | Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients | FID | 42.8 | pix2pixHD |
| 1 Image, 2*2 Stitching | Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients | Kernel Inception Distance | 0.00258 | pix2pixHD |
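
For readers unfamiliar with the mIoU column, the following is a minimal sketch of how mean intersection-over-union can be computed between a predicted label map and a reference label map. In labels-to-photo benchmarks the prediction typically comes from running a segmentation network on the generated image, though the exact protocol behind the numbers above is not stated here. The function name `mean_iou` and the nested-list representation are assumptions for illustration; the result is a fraction in [0, 1], whereas the table values appear to be on a 0 to 100 scale.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Mean IoU over classes that appear in the prediction or the target.

    Illustrative sketch; classes absent from both maps are skipped.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes

    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel counts once toward both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1

    ious = [intersection[c] / union[c] for c in range(num_classes) if union[c] > 0]
    return sum(ious) / len(ious) if ious else 0.0
```

For example, with `pred = [[0, 0], [1, 1]]` and `target = [[0, 1], [1, 1]]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, giving a mean of 7/12.
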