Image Generation on WISE
Metric: Cultural (higher is better)
Results
| # | Model | Cultural | SOTA | Extra Data | Paper | Date | Code |
|---|-------|----------|------|------------|-------|------|------|
| 1 | Bagel (w/ cot) | 0.76 | SOTA | No | Emerging Properties in Unified Multimodal Pretraining | 2025-05-20 | Code |
| 2 | MindOmni (w/ cot) | 0.75 | SOTA | No | MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO | 2025-05-19 | Code |
| 3 | MetaQuery-XL | 0.56 | SOTA | No | Transfer between Modalities with MetaQueries | 2025-04-08 | - |
| 4 | UniWorld-V1 | 0.53 | - | No | UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation | 2025-06-03 | Code |
| 5 | Playground-v2.5-1024px-aesthetic | 0.49 | SOTA | No | Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation | 2024-02-27 | - |
| 6 | PixArt-XL-2-1024-MS | 0.45 | SOTA | No | PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis | 2023-09-30 | Code |
| 7 | Bagel | 0.44 | - | No | Emerging Properties in Unified Multimodal Pretraining | 2025-05-20 | Code |
| 8 | stable-diffusion-3.5-large | 0.44 | - | No | Scaling Rectified Flow Transformers for High-Resolution Image Synthesis | 2024-03-05 | Code |
| 9 | stable-diffusion-xl-base-0.9 | 0.43 | SOTA | No | SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis | 2023-07-04 | Code |
| 10 | MindOmni (w/o cot) | 0.4 | - | No | MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO | 2025-05-19 | Code |
| 11 | Emu3-gen | 0.34 | - | No | Emu3: Next-Token Prediction is All You Need | 2024-09-27 | Code |
| 12 | Janus-pro | 0.3 | - | No | Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling | 2025-01-29 | Code |
| 13 | Show-o | 0.28 | - | No | Show-o: One Single Transformer to Unify Multimodal Understanding and Generation | 2024-08-22 | Code |
| 14 | Janus | 0.16 | - | No | Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation | 2024-10-17 | Code |