Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Text-to-Video Generation on MSR-VTT

Metric: CLIPSIM (higher is better)
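
The page does not define CLIPSIM. It is commonly described as the average CLIP similarity between the text prompt and each generated video frame. The snippet below is a minimal sketch under that assumption: it takes precomputed embeddings as plain lists of floats, and the function names `cosine` and `clipsim` are hypothetical, not part of any library. Published scores may also depend on the CLIP checkpoint, frame sampling, and any scaling used by each paper, none of which is modeled here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def clipsim(text_embedding, frame_embeddings):
    """Mean text-to-frame cosine similarity for one video.

    Assumption: the embeddings were produced elsewhere by a CLIP
    text encoder and a CLIP image encoder; this sketch only averages.
    """
    if not frame_embeddings:
        raise ValueError("at least one frame embedding is required")
    sims = [cosine(text_embedding, frame) for frame in frame_embeddings]
    return sum(sims) / len(sims)
```

A benchmark-level score would then be the mean of `clipsim` over all prompt and video pairs in the evaluation set, which is why higher values indicate closer agreement between text and video.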

Results

| # | Model | CLIPSIM | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | PixelDance | 0.3125 | No | Make Pixels Dance: High-Dynamic Video Generation | 2023-11-18 | - |
| 2 | VideoPoet | 0.3123 | No | VideoPoet: A Large Language Model for Zero-Shot ... | 2023-12-21 | - |
| 3 | Show-1 | 0.3072 | No | Show-1: Marrying Pixel and Latent Diffusion Mode... | 2023-09-27 | Code |
| 4 | Make-A-Video | 0.3049 | No | Make-A-Video: Text-to-Video Generation without T... | 2022-09-29 | Code |
| 5 | Video-LaVIT | 0.3012 | No | Video-LaVIT: Unified Video-Language Pre-training... | 2024-02-05 | Code |
| 6 | TF-T2V | 0.2991 | No | A Recipe for Scaling up Text-to-Video Generation... | 2023-12-25 | Code |
| 7 | HiGen | 0.2947 | No | Hierarchical Spatio-temporal Decoupling for Text... | 2023-12-07 | Code |
| 8 | VideoComposer | 0.2932 | No | VideoComposer: Compositional Video Synthesis wit... | 2023-06-03 | Code |
| 9 | ModelScopeT2V | 0.293 | No | ModelScope Text-to-Video Technical Report | 2023-08-12 | Code |
| 10 | Video LDM | 0.2929 | No | Align your Latents: High-Resolution Video Synthe... | 2023-04-18 | Code |
| 11 | Snap Video (512×288) | 0.2793 | No | Snap Video: Scaled Spatiotemporal Transformers f... | 2024-02-22 | - |
| 12 | Snap Video (288×288) | 0.2793 | No | Snap Video: Scaled Spatiotemporal Transformers f... | 2024-02-22 | - |
| 13 | MMVG | 0.2644 | No | Tell Me What Happened: Unifying Text-guided Vide... | 2022-11-23 | Code |
| 14 | CogVideo (English) | 0.2631 | No | Make-A-Video: Text-to-Video Generation without T... | 2022-09-29 | Code |
| 15 | CogVideo (Chinese) | 0.2614 | No | Align your Latents: High-Resolution Video Synthe... | 2023-04-18 | Code |
| 16 | NUWA | 0.2439 | No | NÜWA: Visual Synthesis Pre-training for Neural v... | 2021-11-24 | Code |
| 17 | GODIVA | 0.2402 | No | GODIVA: Generating Open-DomaIn Videos from nAtur... | 2021-04-30 | Code |