Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
We present Polarity Sampling, a theoretically justified plug-and-play method for controlling the generation quality and diversity of pre-trained deep generative networks (DGNs). Leveraging the fact that DGNs are, or can be approximated by, continuous piecewise affine splines, we derive the analytical DGN output space distribution as a function of the product of the DGN's Jacobian singular values raised to a power $\rho$. We dub $\rho$ the $\textbf{polarity}$ parameter and prove that $\rho$ focuses the DGN sampling on the modes ($\rho < 0$) or anti-modes ($\rho > 0$) of the DGN output-space distribution. We demonstrate that nonzero polarity values achieve a better precision-recall (quality-diversity) Pareto frontier than standard methods, such as truncation, for a number of state-of-the-art DGNs. We also present quantitative and qualitative results on the improvement of overall generation quality (e.g., in terms of the Fréchet Inception Distance, FID) for a number of state-of-the-art DGNs, including StyleGAN3, BigGAN-deep, and NVAE, on different conditional and unconditional image generation tasks. In particular, Polarity Sampling redefines the state of the art for StyleGAN2 on the FFHQ dataset to FID 2.57, StyleGAN2 on the LSUN Car dataset to FID 2.27, and StyleGAN3 on the AFHQv2 dataset to FID 3.95. Demo: bit.ly/polarity-samp
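The weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy two-layer ReLU generator, the function names (`jacobian`, `log_volume`, `polarity_weights`, `polarity_sample`), the use of all singular values rather than a subset, and the resampling-from-a-pool strategy are all assumptions made for illustration. Each latent is weighted by the product of its Jacobian singular values raised to the power $\rho$, and latents are then redrawn in proportion to those weights.

```python
import numpy as np

# Hypothetical toy generator G(z) = W2 @ relu(W1 @ z + b1), a continuous
# piecewise affine map. All names below are illustrative assumptions.

def jacobian(z, W1, b1, W2):
    # On each affine region the Jacobian is W2 @ diag(mask) @ W1.
    mask = (W1 @ z + b1 > 0).astype(float)
    return W2 @ (mask[:, None] * W1)

def log_volume(z, W1, b1, W2, eps=1e-12):
    # Sum of log singular values = log of their product.
    sigma = np.linalg.svd(jacobian(z, W1, b1, W2), compute_uv=False)
    return np.sum(np.log(sigma + eps))

def polarity_weights(log_vols, rho):
    # Weights proportional to (product of singular values) ** rho,
    # computed in log space for numerical stability.
    logits = rho * np.asarray(log_vols, dtype=float)
    logits = logits - logits.max()
    w = np.exp(logits)
    return w / w.sum()

def polarity_sample(latents, log_vols, rho, n, rng):
    # Redraw n latents from the pool with probability given by the weights.
    p = polarity_weights(log_vols, rho)
    idx = rng.choice(len(latents), size=n, replace=True, p=p)
    return latents[idx]

# Usage: build a pool of latents once, then resample for any rho.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 2)), rng.normal(size=16)
W2 = rng.normal(size=(3, 16))
pool = rng.normal(size=(500, 2))
pool_log_vols = np.array([log_volume(z, W1, b1, W2) for z in pool])
picked = polarity_sample(pool, pool_log_vols, rho=-1.0, n=100, rng=rng)
```

With `rho = 0` the weights are uniform, so the sketch reduces to ordinary latent sampling. A negative `rho` shifts weight toward latents whose Jacobian has a small singular-value product, and a positive `rho` shifts it toward a large product, mirroring the mode and anti-mode behaviour stated above. For a real DGN the Jacobian would have to come from automatic differentiation rather than the closed form used here.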
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Generation | LSUN Cat 256 x 256 | FID | 6.34 | Polarity-StyleGAN2 |
| Image Generation | AFHQv2 | FID | 3.95 | Polarity-StyleGAN3 |
| Image Generation | CelebA-HQ 1024 x 1024 | FID | 7.28 | Polarity-ProGAN |
| Image Generation | FFHQ 1024 x 1024 | FID | 2.57 | Polarity-StyleGAN2 |
| Image Generation | LSUN Car 512 x 384 | FID | 2.27 | Polarity-StyleGAN2 |
| Image Generation | LSUN Churches 256 x 256 | FID | 3.92 | Polarity-StyleGAN2 |
| Image Generation | ImageNet 256 x 256 | FID | 6.82 | Polarity-BigGAN |