Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Show-1

Reported on 8 benchmarks across 1 task · 1 paper · 3 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
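The matching rule in the note can be sketched as follows. This is a minimal illustration, not the site's actual implementation: the record fields, the model name `ExampleNet`, the paper ids, and the values are all hypothetical. It shows that exact, case-sensitive matching merges results from different papers that share a name, while a differently spelled name forms its own group.

```python
from collections import defaultdict

# Hypothetical records: names, paper ids, and values are made up for illustration.
records = [
    {"model": "ExampleNet", "paper": "paper-A", "metric": "FID", "value": 10.0},
    {"model": "ExampleNet", "paper": "paper-B", "metric": "FID", "value": 12.0},
    {"model": "examplenet", "paper": "paper-C", "metric": "FID", "value": 11.0},
]


def group_by_exact_name(rows):
    """Group rows by model name using exact, case-sensitive string equality."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["model"]].append(row)
    return dict(groups)


def names_with_multiple_papers(groups):
    """Return names whose results come from more than one paper.

    Such names may refer to different model variants.
    """
    return sorted(
        name
        for name, rows in groups.items()
        if len({row["paper"] for row in rows}) > 1
    )


groups = group_by_exact_name(records)
ambiguous = names_with_multiple_papers(groups)
```

In this sketch, a name that appears with more than one paper is only a hint that variants may have been merged; checking the linked paper for each result remains the reliable way to tell them apart.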

Natural Language Processing · 8 results

Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset
Temporal Consistency · uses extra data · 2023-09-27
60.83
best: 61.46 (VideoCrafter2)
SOTA
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation arXiv:2309.15818
Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset
Visual Quality · uses extra data · 2023-09-27
53.74
best: 54.82 (VideoCrafter2)
SOTA
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation arXiv:2309.15818
Text-to-Video Generation on MSR-VTT
CLIPSIM · 2023-09-27
0.3072
best: 0.3125 (PixelDance)
SOTA
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation arXiv:2309.15818
Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset
Motion Quality · uses extra data · 2023-09-27
52.19
best: 63.98 (VideoCrafter2)
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation arXiv:2309.15818
Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset
Text-to-Video Alignment · uses extra data · 2023-09-27
62.07
best: 68.49 (Lavie)
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation arXiv:2309.15818
Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset
Total Score · uses extra data · 2023-09-27
229
best: 243 (VideoCrafter2)
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation arXiv:2309.15818
Text-to-Video Generation on MSR-VTT
FID · 2023-09-27
13.08
best: 8.19 (TF-T2V)
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation arXiv:2309.15818
Text-to-Video Generation on MSR-VTT
FVD · 2023-09-27
538
best: 998 (MagicVideo)
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation arXiv:2309.15818