1 Image, 2*2 Stitchi on COCO (Common Objects in Context)
Metric: Inception score (higher is better)
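For readers unfamiliar with the metric, the sketch below shows one common way to compute an Inception score, following the usual definition IS = exp(E_x[KL(p(y|x) || p(y))]). This page does not say which evaluation protocol each entry used, so this is an illustration only. It assumes the per-image class probabilities from a pretrained Inception classifier are already available as an array; the function name `inception_score` and the `eps` smoothing constant are hypothetical choices for this example.

```python
import numpy as np


def inception_score(probs, eps=1e-12):
    """Illustrative Inception score from precomputed class probabilities.

    probs: array of shape (N, C); row i is assumed to be p(y | x_i), the
           classifier's softmax output for generated image x_i.
    eps:   small constant (an assumption of this sketch) to avoid log(0).
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal class distribution p(y), averaged over all generated images.
    marginal = probs.mean(axis=0, keepdims=True)
    # KL(p(y|x) || p(y)) for each image, summed over classes.
    kl = (probs * (np.log(probs + eps) - np.log(marginal + eps))).sum(axis=1)
    # Exponentiate the mean KL divergence.
    return float(np.exp(kl.mean()))
```

Under this definition, a generator whose images all receive the same class distribution scores about 1, while confident predictions spread evenly over C classes score about C, which is why higher is better.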
Results
| # | Model | Inception score | Extra Data | Paper | Date | Code | SOTA |
|---|---|---|---|---|---|---|---|
| 1 | FuseDream (k=10, 256) | 34.67 | No | FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization | 2021-12-02 | Code | SOTA |
| 2 | FuseDream (few-shot, k=5) | 34.26 | No | FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization | 2021-12-02 | Code | |
| 3 | FuseDream (k=5, 256) | 34.26 | No | FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization | 2021-12-02 | Code | |
| 4 | DM-GAN+CL | 33.34 | No | Improving Text-to-Image Synthesis Using Contrastive Learning | 2021-07-06 | Code | SOTA |
| 5 | DM-GAN + VICTR | 32.37 | No | VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks | 2020-10-07 | Code | SOTA |
| 6 | Lafite | 32.34 | No | LAFITE: Towards Language-Free Training for Text-to-Image Generation | 2021-11-27 | Code | |
| 7 | DM-GAN (256 x 256) | 32.2 | No | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | 2021-11-24 | Code | |
| 8 | Swinv2-Imagen | 31.46 | Yes | Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation | 2022-10-18 | - | |
| 9 | XMC-GAN (256 x 256) | 30.5 | No | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | 2021-11-24 | Code | |
| 10 | DM-GAN | 30.49 | No | DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis | 2019-04-02 | Code | SOTA |
| 11 | AttnGAN + VICTR | 28.18 | No | VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks | 2020-10-07 | Code | |
| 12 | OP-GAN | 27.88 | No | Semantic Object Accuracy for Generative Text-to-Image Synthesis | 2019-10-29 | Code | |
| 13 | NÜWA (256 x 256) | 27.2 | No | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | 2021-11-24 | Code | |
| 14 | Lafite (zero-shot) | 26.02 | No | LAFITE: Towards Language-Free Training for Text-to-Image Generation | 2021-11-27 | Code | |
| 15 | AttnGAN+CL | 25.7 | No | Improving Text-to-Image Synthesis Using Contrastive Learning | 2021-07-06 | Code | |
| 16 | AttnGAN + OP | 24.76 | No | Generating Multiple Objects at Spatially Distinct Locations | 2019-01-03 | Code | SOTA |
| 17 | AttnGAN (256 x 256) | 23.3 | No | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | 2021-11-24 | Code | |
| 18 | DF-GAN (256 x 256) | 18.7 | No | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | 2021-11-24 | Code | |
| 19 | CogView | 18.2 | Yes | CogView: Mastering Text-to-Image Generation via Transformers | 2021-05-26 | Code | |
| 20 | CogView (256 x 256) | 18.2 | No | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | 2021-11-24 | Code | |
| 21 | DALL-E (256 x 256) | 17.9 | No | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | 2021-11-24 | Code | |
| 22 | StackGAN + OP | 12.12 | No | Generating Multiple Objects at Spatially Distinct Locations | 2019-01-03 | Code | |
| 23 | StackGAN + VICTR | 10.38 | No | VICTR: Visual Information Captured Text Representation for Text-to-Image Multimodal Tasks | 2020-10-07 | Code | |
| 24 | ChatPainter | 9.74 | No | ChatPainter: Improving Text to Image Generation using Dialogue | 2018-02-22 | - | SOTA |
| 25 | StackGAN-v1 | 8.45 | No | StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks | 2017-10-19 | Code | SOTA |