Image Generation on WISE
Metric: Overall (higher is better)
Results
| # | Model | Overall | SOTA | Extra Data | Paper | Date | Code |
|---|-------|---------|------|------------|-------|------|------|
| 1 | MindOmni (w/ cot) | 0.71 | SOTA | No | MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO | 2025-05-19 | Code |
| 2 | Bagel (w/ cot) | 0.7 | - | No | Emerging Properties in Unified Multimodal Pretraining | 2025-05-20 | Code |
| 3 | MetaQuery-XL | 0.55 | SOTA | No | Transfer between Modalities with MetaQueries | 2025-04-08 | - |
| 4 | UniWorld-V1 | 0.55 | - | No | UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation | 2025-06-03 | Code |
| 5 | Bagel | 0.52 | - | No | Emerging Properties in Unified Multimodal Pretraining | 2025-05-20 | Code |
| 6 | Playground-v2.5-1024px-aesthetic | 0.49 | SOTA | No | Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation | 2024-02-27 | - |
| 7 | PixArt-XL-2-1024-MS | 0.47 | SOTA | No | PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis | 2023-09-30 | Code |
| 8 | stable-diffusion-3.5-large | 0.46 | - | No | Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | 2024-03-05 | Code |
| 9 | stable-diffusion-xl-base-0.9 | 0.43 | SOTA | No | SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis | 2023-07-04 | Code |
| 10 | MindOmni (w/o cot) | 0.43 | - | No | MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO | 2025-05-19 | Code |
| 11 | Emu3-gen | 0.39 | - | No | Emu3: Next-Token Prediction is All You Need | 2024-09-27 | Code |
| 12 | Janus-pro | 0.35 | - | No | Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling | 2025-01-29 | Code |
| 13 | Show-o | 0.35 | - | No | Show-o: One Single Transformer to Unify Multimodal Understanding and Generation | 2024-08-22 | Code |
| 14 | Janus | 0.23 | - | No | Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation | 2024-10-17 | Code |