Taesung Park, Ming-Yu Liu, Ting-Chun Wang, Jun-Yan Zhu
We propose spatially-adaptive normalization, a simple but effective layer for synthesizing photorealistic images given an input semantic layout. Previous methods feed the semantic layout directly as input to the deep network, where it is then processed through stacks of convolution, normalization, and nonlinearity layers. We show that this is suboptimal, as the normalization layers tend to "wash away" semantic information. To address the issue, we propose using the input layout to modulate the activations in normalization layers through a spatially-adaptive, learned transformation. Experiments on several challenging datasets demonstrate the advantage of the proposed method over existing approaches in both visual fidelity and alignment with input layouts. Finally, our model allows user control over both semantics and style. Code is available at https://github.com/NVlabs/SPADE.
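To make the idea of layout-driven modulation concrete, here is a minimal NumPy sketch, not the authors' implementation. It normalizes activations without a learned affine term, then applies a per-pixel scale and shift computed from a one-hot semantic layout. The function name `spade_sketch`, the weight matrices `w_gamma` and `w_beta`, and the use of a single per-pixel linear map (a 1x1 convolution) to predict the modulation parameters are all assumptions made for illustration; the released code at the URL above is the reference for the actual layer.

```python
import numpy as np

def spade_sketch(x, seg, w_gamma, w_beta, eps=1e-5):
    """Illustrative spatially-adaptive normalization (hypothetical helper).

    x:       (N, C, H, W) activations
    seg:     (N, K, H, W) one-hot semantic layout at the same resolution as x
    w_gamma: (C, K) assumed 1x1-conv weights predicting the per-pixel scale
    w_beta:  (C, K) assumed 1x1-conv weights predicting the per-pixel shift
    """
    # Parameter-free normalization: per-channel statistics over batch and space.
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mu) / np.sqrt(var + eps)

    # Scale and shift vary with spatial location because they are
    # computed from the layout at every pixel.
    gamma = np.einsum("ck,nkhw->nchw", w_gamma, seg)
    beta = np.einsum("ck,nkhw->nchw", w_beta, seg)
    return gamma * x_hat + beta

# Toy usage with random data (all sizes are arbitrary).
rng = np.random.default_rng(0)
n, c, k, h, w = 2, 3, 5, 4, 4
x = rng.normal(size=(n, c, h, w))
labels = rng.integers(0, k, size=(n, h, w))
seg = np.eye(k)[labels].transpose(0, 3, 1, 2)  # (N, K, H, W) one-hot layout
out = spade_sketch(x, seg, rng.normal(size=(c, k)), rng.normal(size=(c, k)))
```

Because `gamma` and `beta` are functions of the layout at each location, the semantic labels are re-injected after the normalization step rather than being passed only through the network input, which is the mechanism the abstract credits with preserving semantic information.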
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image-to-Image Translation | COCO-Stuff Labels-to-Photos | FID | 22.6 | SPADE |
| Image-to-Image Translation | COCO-Stuff Labels-to-Photos | mIoU | 37.4 | SPADE |
| Image-to-Image Translation | Cityscapes Labels-to-Photo | FID | 71.8 | SPADE |
| Image-to-Image Translation | Cityscapes Labels-to-Photo | mIoU | 62.3 | SPADE |
| Image-to-Image Translation | ADE20K Labels-to-Photos | FID | 33.9 | SPADE |
| Image-to-Image Translation | ADE20K Labels-to-Photos | mIoU | 38.5 | SPADE |
| Image-to-Image Translation | ADE20K-Outdoor Labels-to-Photos | FID | 63.3 | SPADE |
| Image-to-Image Translation | ADE20K-Outdoor Labels-to-Photos | mIoU | 30.8 | SPADE |
| Image Generation | COCO-Stuff Labels-to-Photos | FID | 22.6 | SPADE |
| Image Generation | COCO-Stuff Labels-to-Photos | mIoU | 37.4 | SPADE |
| Image Generation | Cityscapes Labels-to-Photo | FID | 71.8 | SPADE |
| Image Generation | Cityscapes Labels-to-Photo | mIoU | 62.3 | SPADE |
| Image Generation | ADE20K Labels-to-Photos | FID | 33.9 | SPADE |
| Image Generation | ADE20K Labels-to-Photos | mIoU | 38.5 | SPADE |
| Image Generation | ADE20K-Outdoor Labels-to-Photos | FID | 63.3 | SPADE |
| Image Generation | ADE20K-Outdoor Labels-to-Photos | mIoU | 30.8 | SPADE |
| Sketch-to-Image Translation | COCO-Stuff | FID | 89.2 | SPADE |
| Sketch-to-Image Translation | COCO-Stuff | FID-C | 48.9 | SPADE |
| 1 Image, 2*2 Stitching | COCO-Stuff Labels-to-Photos | FID | 22.6 | SPADE |
| 1 Image, 2*2 Stitching | COCO-Stuff Labels-to-Photos | mIoU | 37.4 | SPADE |
| 1 Image, 2*2 Stitching | Cityscapes Labels-to-Photo | FID | 71.8 | SPADE |
| 1 Image, 2*2 Stitching | Cityscapes Labels-to-Photo | mIoU | 62.3 | SPADE |
| 1 Image, 2*2 Stitching | ADE20K Labels-to-Photos | FID | 33.9 | SPADE |
| 1 Image, 2*2 Stitching | ADE20K Labels-to-Photos | mIoU | 38.5 | SPADE |
| 1 Image, 2*2 Stitching | ADE20K-Outdoor Labels-to-Photos | FID | 63.3 | SPADE |
| 1 Image, 2*2 Stitching | ADE20K-Outdoor Labels-to-Photos | mIoU | 30.8 | SPADE |