Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Text-to-Video Generation on MSR-VTT

Metric: FVD (Fréchet Video Distance; lower is better)

Results

| # | Model | FVD | Extra Data | Paper | Date | Code |
|---|-------|-----|------------|-------|------|------|
| 1 | MagicVideo | 998 | No | MagicVideo: Efficient Video Generation With Late... | 2022-11-20 | - |
| 2 | VideoComposer | 580 | No | VideoComposer: Compositional Video Synthesis wit... | 2023-06-03 | Code |
| 3 | ModelScopeT2V | 550 | No | ModelScope Text-to-Video Technical Report | 2023-08-12 | Code |
| 4 | Show-1 | 538 | No | Show-1: Marrying Pixel and Latent Diffusion Mode... | 2023-09-27 | Code |
| 5 | TF-T2V | 441 | No | A Recipe for Scaling up Text-to-Video Generation... | 2023-12-25 | Code |
| 6 | HiGen | 406 | No | Hierarchical Spatio-temporal Decoupling for Text... | 2023-12-07 | Code |
| 7 | PixelDance | 381 | No | Make Pixels Dance: High-Dynamic Video Generation | 2023-11-18 | - |
| 8 | VideoPoet | 213 | No | VideoPoet: A Large Language Model for Zero-Shot ... | 2023-12-21 | - |
| 9 | Video-LaVIT | 188.36 | No | Video-LaVIT: Unified Video-Language Pre-training... | 2024-02-05 | Code |
| 10 | Snap Video (288×288) | 110.4 | No | Snap Video: Scaled Spatiotemporal Transformers f... | 2024-02-22 | - |
| 11 | Snap Video (512×288) | 104 | No | Snap Video: Scaled Spatiotemporal Transformers f... | 2024-02-22 | - |