Image Generation on ImageNet 64x64

Metric: Bits per dim (lower is better). The table below is sorted in descending order, so the best result appears last.
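
For readers unfamiliar with the metric, bits per dim is commonly obtained by dividing a model's negative log-likelihood for one image (in nats) by the number of pixel dimensions and by ln 2. The following is a minimal sketch of that conversion, not taken from any paper listed here. The function name `bits_per_dim` and its parameters are hypothetical, and the default shape assumes 64x64 RGB images.

```python
import math


def bits_per_dim(nll_nats: float, height: int = 64, width: int = 64, channels: int = 3) -> float:
    """Convert a per-image negative log-likelihood in nats to bits per dimension.

    Assumes the likelihood is defined over height * width * channels
    discrete pixel values (an assumption; individual papers may differ).
    """
    num_dims = height * width * channels  # 12288 for a 64x64 RGB image
    return nll_nats / (num_dims * math.log(2))


if __name__ == "__main__":
    # Illustrative input only: an NLL chosen so that the result is 3.0 bits per dim.
    example_nll = 3.0 * 64 * 64 * 3 * math.log(2)
    print(f"{bits_per_dim(example_nll):.3f}")
```

Individual entries may report the metric under different preprocessing or dequantization conventions, so the sketch shows only the unit conversion.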

Results

#	Model	Bits per dim	Extra Data	Paper	Date	Code
1	Logsparse (6 layers)	4.351	No	Enhancing the Locality and Breaking the Memory B...	2019-06-29	Code
2	Axial Transformer (6 layers)	4.032	No	Axial Attention in Multidimensional Transformers	2019-12-20	Code
3	Glow (Kingma and Dhariwal, 2018)	3.81	No	Glow: Generative Flow with Invertible 1x1 Convol...	2018-07-09	Code
4	Residual Flow	3.757	No	Residual Flows for Invertible Generative Modeling	2019-06-06	Code
5	MaCow (Unf)	3.75	No	MaCow: Masked Convolutional Generative Flow	2019-02-12	Code
6	Reformer (6 layers)	3.74	No	Reformer: The Efficient Transformer	2020-01-13	Code
7	Performer (6 layers)	3.719	No	Rethinking Attention with Performers	2020-09-30	Code
8	MALI	3.71	No	MALI: A memory efficient and reverse accurate in...	2021-02-09	Code
9	Reformer (12 layers)	3.71	No	Reformer: The Efficient Transformer	2020-01-13	Code
10	Parallel Multiscale	3.7	No	Parallel Multiscale Autoregressive Density Estim...	2017-03-10	-
11	Flow++	3.69	No	Flow++: Improving Flow-Based Generative Models w...	2019-02-01	Code
12	MaCow (Var)	3.69	No	MaCow: Masked Convolutional Generative Flow	2019-02-12	Code
13	Performer (12 layers)	3.636	No	Rethinking Attention with Performers	2020-09-30	Code
14	PixelCNN	3.57	No	PixelCNN Models with Auxiliary Variables for Nat...	2016-12-24	-
15	Gated PixelCNN (van den Oord et al., 2016c)	3.57	No	Conditional Image Generation with PixelCNN Decod...	2016-06-16	Code
16	Improved DDPM	3.53	No	Improved Denoising Diffusion Probabilistic Models	2021-02-18	Code
17	SPN	3.52	No	Generating High Fidelity Images with Subscale Pi...	2018-12-04	-
18	Very Deep VAE	3.52	No	Very Deep VAEs Generalize Autoregressive Models ...	2020-11-20	Code
19	Combiner-Mixture	3.504	No	Combiner: Full Attention Transformer with Sparse...	2021-07-12	Code
20	Sparse Transformer 59M (strided)	3.44	No	Generating Long Sequences with Sparse Transformers	2019-04-23	Code
21	MRCNF	3.44	No	Multi-Resolution Continuous Normalizing Flows	2021-06-15	Code
22	Hourglass	3.44	No	Hierarchical Transformers Are More Efficient Lan...	2021-10-26	Code
23	Routing Transformer	3.43	No	Efficient Content-Based Sparse Attention with Ro...	2020-03-12	Code
24	Combiner-Axial	3.42	No	Combiner: Full Attention Transformer with Sparse...	2021-07-12	Code
25	VDM	3.4	No	Variational Diffusion Models	2021-07-01	Code
26	NDM	3.35	No	Neural Diffusion Models	2023-10-12	-
27	FM	3.31	No	Flow Matching for Generative Modeling	2022-10-06	Code
28	BSI	3.22	No	Generative Modeling with Bayesian Sample Inference	2025-02-11	Code
29	NFDM	3.2	No	Neural Flow Diffusion Models: Learnable Forward ...	2024-04-19	Code
30	TarFlow	2.99	No	Normalizing Flows are Capable Generative Models	2024-12-09	Code