Jonathan Ho, Ajay Jain, Pieter Abbeel
We present high-quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics, and our models naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding. On the unconditional CIFAR-10 dataset, we obtain an Inception score of 9.46 and a state-of-the-art FID score of 3.17. On 256x256 LSUN, we obtain sample quality similar to ProgressiveGAN. Our implementation is available at https://github.com/hojonathanho/diffusion
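The training objective mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes the simplified noise-prediction form of the weighted bound, a linear variance schedule from 1e-4 to 0.02 over T = 1000 steps, and a hypothetical placeholder `predict_noise` standing in for the neural network. None of these names appear in the text above, and the released implementation may differ.

```python
import numpy as np

T = 1000                                   # assumed number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)         # assumed linear variance schedule
alphas_cumprod = np.cumprod(1.0 - betas)   # cumulative product of (1 - beta_t)


def q_sample(x0, t, noise):
    """Forward process: draw x_t from x_0 in closed form for per-example steps t."""
    # Reshape so the per-example scalar broadcasts over the image dimensions.
    a_bar = alphas_cumprod[t].reshape(-1, *([1] * (x0.ndim - 1)))
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise


def predict_noise(x_t, t):
    """Hypothetical stand-in for the trained network; always predicts zero noise."""
    return np.zeros_like(x_t)


def simple_loss(x0, rng):
    """Monte Carlo estimate of the noise-prediction loss for one batch."""
    t = rng.integers(0, T, size=x0.shape[0])   # one random step per example
    noise = rng.standard_normal(x0.shape)      # the noise the model should recover
    x_t = q_sample(x0, t, noise)
    return np.mean((noise - predict_noise(x_t, t)) ** 2)


rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 3, 32, 32))  # fake CIFAR-10-sized batch
loss = simple_loss(x0, rng)
```

With the zero-predicting placeholder the loss is simply the mean squared noise, so it sits close to 1; a trained network would be expected to drive it lower.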
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Generation | LSUN Cat 256 x 256 | FID | 19.75 | Denoising Diffusion Probabilistic Model |
| Image Generation | ImageNet 32x32 | FID | 16.18 | DDPM |
| Image Generation | ImageNet 32x32 | bpd | 3.89 | DDPM |
| Image Generation | LSUN Bedroom 256 x 256 | FID | 4.9 | Denoising Diffusion Probabilistic Model (large) |
| Image Generation | LSUN Bedroom 256 x 256 | FID | 6.36 | Denoising Diffusion Probabilistic Model |
| Image Generation | LSUN Bedroom 256 x 256 | FD | 229.76 | Denoising Diffusion Probabilistic Model (large, DINOv2) |
| Image Generation | LSUN Bedroom 256 x 256 | Precision | 0.79 | Denoising Diffusion Probabilistic Model (large, DINOv2) |
| Image Generation | LSUN Bedroom 256 x 256 | Recall | 0.61 | Denoising Diffusion Probabilistic Model (large, DINOv2) |
| Image Generation | LSUN Bedroom | FID-50k | 4.9 | Denoising Diffusion Probabilistic Model |
| Image Generation | CIFAR-10 | FID | 3.17 | Denoising Diffusion Probabilistic Model |
| Image Generation | LSUN Churches 256 x 256 | FID | 7.89 | Denoising Diffusion Probabilistic Model |
| Density Estimation | CIFAR-10 | NLL (bits/dim) | 3.69 | DDPM |
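For readers unfamiliar with the FID values in the table, the following is a minimal sketch of the Fréchet distance between two Gaussians, which is the quantity FID evaluates on feature statistics of real and generated images. The function name `frechet_distance` is hypothetical, and the feature extraction step (an Inception network for FID) is omitted, so this is an illustration of the formula rather than the evaluation code used for the reported numbers.

```python
import numpy as np


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Tr((S1 S2)^(1/2)) equals the sum of the square roots of the eigenvalues
    # of S1 S2, which are real and non-negative for covariance matrices.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    trace_sqrt = np.sum(np.sqrt(np.clip(eigvals.real, 0.0, None)))
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * trace_sqrt)


# Identical statistics give a distance of zero (up to rounding).
same = frechet_distance(np.zeros(3), np.eye(3), np.zeros(3), np.eye(3))

# Shifting the mean by one unit in each of three dimensions adds 3.
shifted = frechet_distance(np.zeros(3), np.eye(3), np.ones(3), np.eye(3))
```

Lower values indicate that the two sets of feature statistics are closer, which is why lower FID is read as better sample quality.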