Arantxa Casanova, Marlène Careil, Jakob Verbeek, Michal Drozdzal, Adriana Romero-Soriano
Generative Adversarial Networks (GANs) can generate near-photorealistic images in narrow domains such as human faces. Yet, modeling complex distributions of datasets such as ImageNet and COCO-Stuff remains challenging in unconditional settings. In this paper, we take inspiration from kernel density estimation techniques and introduce a non-parametric approach to modeling distributions of complex datasets. We partition the data manifold into a mixture of overlapping neighborhoods, each described by a datapoint and its nearest neighbors, and introduce a model, called instance-conditioned GAN (IC-GAN), which learns the distribution around each datapoint. Experimental results on ImageNet and COCO-Stuff show that IC-GAN significantly improves over unconditional models and unsupervised data partitioning baselines. Moreover, we show that IC-GAN can effortlessly transfer to datasets not seen during training by simply changing the conditioning instances, and still generate realistic images. Finally, we extend IC-GAN to the class-conditional case and show semantically controllable generation and competitive quantitative results on ImageNet, while also improving over BigGAN on ImageNet-LT. Code and trained models to reproduce the reported results are available at https://github.com/facebookresearch/ic_gan.
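The neighborhood partitioning described in the abstract can be illustrated with a minimal sketch. It is not taken from the released code: the function names `build_neighborhoods` and `sample_training_pair`, the use of cosine similarity on precomputed feature vectors, and the choice to count each datapoint as its own nearest neighbor are all assumptions made for illustration.

```python
import numpy as np


def build_neighborhoods(features, k):
    """Return an (N, k) index array: row i lists the k nearest neighbors of
    datapoint i under cosine similarity (assumed metric), including i itself.

    Assumes every feature vector has a non-zero norm.
    """
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    similarity = normed @ normed.T  # (N, N) cosine similarities
    # Sort each row by descending similarity and keep the top k indices.
    return np.argsort(-similarity, axis=1)[:, :k]


def sample_training_pair(neighbors, rng):
    """Pick a conditioning instance i uniformly, then a 'real' sample j
    uniformly from i's neighborhood (hypothetical sampling scheme)."""
    i = int(rng.integers(neighbors.shape[0]))
    j = int(rng.choice(neighbors[i]))
    return i, j
```

Under these assumptions, a generator conditioned on the features of instance `i` would be trained so that its samples are indistinguishable from the neighbors `j`. Because neighborhoods overlap, a single datapoint can serve as a real sample for several conditioning instances.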
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Generation | ImageNet 256x256 | FID | 8.1 | BigGAN+ [Brock et al.] (chx96) |
| Image Generation | ImageNet 256x256 | Inception score | 144.2 | BigGAN+ [Brock et al.] (chx96) |
| Image Generation | ImageNet 64x64 | FID | 6.7 | IC-GAN + DA |
| Image Generation | ImageNet 128x128 | FID | 9.5 | IC-GAN + DA |
| Image Generation | ImageNet 128x128 | Inception score | 108.6 | IC-GAN + DA |
| Conditional Image Generation | ImageNet 256x256 | FID | 8.1 | BigGAN+ [Brock et al.] (chx96) |
| Conditional Image Generation | ImageNet 256x256 | Inception score | 144.2 | BigGAN+ [Brock et al.] (chx96) |
| Conditional Image Generation | ImageNet 64x64 | FID | 6.7 | IC-GAN + DA |
| Conditional Image Generation | ImageNet 128x128 | FID | 9.5 | IC-GAN + DA |
| Conditional Image Generation | ImageNet 128x128 | Inception score | 108.6 | IC-GAN + DA |