Text-to-Image Generation on COCO (Common Objects in Context)

Metric: Inception score (higher is better)
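
For readers unfamiliar with the metric, the following is a minimal sketch of how an Inception score is commonly computed: IS = exp(E_x[KL(p(y|x) || p(y))]). It assumes the per-image class probabilities p(y|x) have already been produced by a pretrained classifier (typically Inception-v3) and are passed in as an array. The function name `inception_score` and the `eps` parameter are illustrative, and the papers listed here may differ in details such as the number of samples or averaging over splits.

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """Compute exp(mean_x KL(p(y|x) || p(y))) from class probabilities.

    probs: array of shape (N, C); each row is p(y|x) for one generated
           image and is assumed to sum to 1.
    eps:   small constant to avoid log(0) (an illustrative choice).
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal class distribution p(y), averaged over all images.
    marginal = probs.mean(axis=0, keepdims=True)
    # Per-image KL divergence between p(y|x) and p(y).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    return float(np.exp(kl.mean()))
```

Under this definition the score ranges from 1 (every image gets the same prediction) up to the number of classes C (each image is classified with full confidence and the classes are covered evenly), which is why higher is better.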

Results

#	Model	Inception score	Extra Data	Paper	Date	Code	SOTA
1	FuseDream (k=10, 256)	34.67	No	FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization	2021-12-02	Code	SOTA
2	FuseDream (few-shot, k=5)	34.26	No	FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization	2021-12-02	Code	-
3	FuseDream (k=5, 256)	34.26	No	FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization	2021-12-02	Code	-
4	DM-GAN+CL	33.34	No	Improving Text-to-Image Synthesis Using Contrastive Learning	2021-07-06	Code	SOTA
5	DM-GAN + VICTR	32.37	No	VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks	2020-10-07	Code	SOTA
6	Lafite	32.34	No	LAFITE: Towards Language-Free Training for Text-to-Image Generation	2021-11-27	Code	-
7	DM-GAN (256 x 256)	32.2	No	NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion	2021-11-24	Code	-
8	Swinv2-Imagen	31.46	Yes	Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation	2022-10-18	-	-
9	XMC-GAN (256 x 256)	30.5	No	NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion	2021-11-24	Code	-
10	DM-GAN	30.49	No	DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis	2019-04-02	Code	SOTA
11	AttnGAN + VICTR	28.18	No	VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks	2020-10-07	Code	-
12	OP-GAN	27.88	No	Semantic Object Accuracy for Generative Text-to-Image Synthesis	2019-10-29	Code	-
13	NÜWA (256 x 256)	27.2	No	NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion	2021-11-24	Code	-
14	Lafite (zero-shot)	26.02	No	LAFITE: Towards Language-Free Training for Text-to-Image Generation	2021-11-27	Code	-
15	AttnGAN+CL	25.7	No	Improving Text-to-Image Synthesis Using Contrastive Learning	2021-07-06	Code	-
16	AttnGAN + OP	24.76	No	Generating Multiple Objects at Spatially Distinct Locations	2019-01-03	Code	SOTA
17	AttnGAN (256 x 256)	23.3	No	NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion	2021-11-24	Code	-
18	DF-GAN (256 x 256)	18.7	No	NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion	2021-11-24	Code	-
19	CogView	18.2	Yes	CogView: Mastering Text-to-Image Generation via Transformers	2021-05-26	Code	-
20	CogView (256 x 256)	18.2	No	NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion	2021-11-24	Code	-
21	DALL-E (256 x 256)	17.9	No	NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion	2021-11-24	Code	-
22	StackGAN + OP	12.12	No	Generating Multiple Objects at Spatially Distinct Locations	2019-01-03	Code	-
23	StackGAN + VICTR	10.38	No	VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks	2020-10-07	Code	-
24	ChatPainter	9.74	No	ChatPainter: Improving Text to Image Generation using Dialogue	2018-02-22	-	SOTA
25	StackGAN-v1	8.45	No	StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks	2017-10-19	Code	SOTA