Text-to-Video Generation on MSR-VTT
Metric: CLIPSIM (higher is better)
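For readers unfamiliar with the metric, CLIPSIM is commonly described as the average CLIP similarity between a text prompt and the frames of the generated video. The sketch below illustrates that averaging step only. It assumes the embeddings have already been computed by some CLIP model; the function names `cosine` and `clipsim` are hypothetical, and the CLIP variant, frame sampling, and prompt set behind the scores on this page are not stated here and may differ between papers.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def clipsim(text_embedding, frame_embeddings):
    """Mean cosine similarity between one text embedding and each frame embedding.

    Assumption: both inputs are precomputed CLIP embeddings (hypothetical here).
    """
    if not frame_embeddings:
        raise ValueError("at least one frame embedding is required")
    sims = [cosine(text_embedding, frame) for frame in frame_embeddings]
    return sum(sims) / len(sims)


# Toy 2-D vectors standing in for real CLIP embeddings.
text = [1.0, 0.0]
frames = [[1.0, 0.0], [0.0, 1.0]]
print(clipsim(text, frames))  # 0.5
```

A benchmark score would then typically be this per-video value averaged over all evaluation prompts.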
Results
| # | Model | CLIPSIM | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | PixelDance | 0.3125 | No | Make Pixels Dance: High-Dynamic Video Generation | 2023-11-18 | - |
| 2 | VideoPoet | 0.3123 | No | VideoPoet: A Large Language Model for Zero-Shot ... | 2023-12-21 | - |
| 3 | Show-1 | 0.3072 | No | Show-1: Marrying Pixel and Latent Diffusion Mode... | 2023-09-27 | Code |
| 4 | Make-A-Video | 0.3049 | No | Make-A-Video: Text-to-Video Generation without T... | 2022-09-29 | Code |
| 5 | Video-LaVIT | 0.3012 | No | Video-LaVIT: Unified Video-Language Pre-training... | 2024-02-05 | Code |
| 6 | TF-T2V | 0.2991 | No | A Recipe for Scaling up Text-to-Video Generation... | 2023-12-25 | Code |
| 7 | HiGen | 0.2947 | No | Hierarchical Spatio-temporal Decoupling for Text... | 2023-12-07 | Code |
| 8 | VideoComposer | 0.2932 | No | VideoComposer: Compositional Video Synthesis wit... | 2023-06-03 | Code |
| 9 | ModelScopeT2V | 0.293 | No | ModelScope Text-to-Video Technical Report | 2023-08-12 | Code |
| 10 | Video LDM | 0.2929 | No | Align your Latents: High-Resolution Video Synthe... | 2023-04-18 | Code |
| 11 | Snap Video (512×288) | 0.2793 | No | Snap Video: Scaled Spatiotemporal Transformers f... | 2024-02-22 | - |
| 12 | Snap Video (288×288) | 0.2793 | No | Snap Video: Scaled Spatiotemporal Transformers f... | 2024-02-22 | - |
| 13 | MMVG | 0.2644 | No | Tell Me What Happened: Unifying Text-guided Vide... | 2022-11-23 | Code |
| 14 | CogVideo (English) | 0.2631 | No | Make-A-Video: Text-to-Video Generation without T... | 2022-09-29 | Code |
| 15 | CogVideo (Chinese) | 0.2614 | No | Align your Latents: High-Resolution Video Synthe... | 2023-04-18 | Code |
| 16 | NUWA | 0.2439 | No | NÜWA: Visual Synthesis Pre-training for Neural v... | 2021-11-24 | Code |
| 17 | GODIVA | 0.2402 | No | GODIVA: Generating Open-DomaIn Videos from nAtur... | 2021-04-30 | Code |