Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Show-1

Reported on 8 benchmarks across 1 task · 1 paper · 3 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
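The exact-name matching described in the note can be illustrated with a minimal sketch. It assumes matching is plain, case-sensitive string equality, which the page does not spell out. The record layout, the function name `group_by_exact_name`, and the `Show-1 (variant)` entry with its placeholder value are hypothetical and are not taken from the site.

```python
from collections import defaultdict

# Hypothetical result records; field names are illustrative, not the site's schema.
# The first two values appear on this page; the third record is a made-up
# placeholder showing how a differently spelled name is treated.
results = [
    {"model": "Show-1", "metric": "CLIPSIM", "value": 0.3072},
    {"model": "Show-1", "metric": "FID", "value": 13.08},
    {"model": "Show-1 (variant)", "metric": "FID", "value": 0.0},
]


def group_by_exact_name(records):
    """Group records by model name using exact string equality (assumed rule)."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["model"]].append(record)
    return dict(grouped)


by_model = group_by_exact_name(results)
# "Show-1" and "Show-1 (variant)" end up under separate keys.
print(sorted(by_model))
```

Under this assumption, any difference in spelling splits results across separate model pages, and two unrelated models that share a name are merged, which is the caveat the note raises.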

Natural Language Processing · 8 results

  • Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset
    Temporal Consistency · uses extra data · 2023-09-27
    60.83
    best: 61.46 (VideoCrafter2)
    SOTA
    Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation, arXiv:2309.15818
  • Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset
    Visual Quality · uses extra data · 2023-09-27
    53.74
    best: 54.82 (VideoCrafter2)
    SOTA
    Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation, arXiv:2309.15818
  • Text-to-Video Generation on MSR-VTT
    CLIPSIM · 2023-09-27
    0.3072
    best: 0.3125 (PixelDance)
    SOTA
    Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation, arXiv:2309.15818
  • Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset
    Motion Quality · uses extra data · 2023-09-27
    52.19
    best: 63.98 (VideoCrafter2)
    Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation, arXiv:2309.15818
  • Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset
    Text-to-Video Alignment · uses extra data · 2023-09-27
    62.07
    best: 68.49 (Lavie)
    Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation, arXiv:2309.15818
  • Text-to-Video Generation on EvalCrafter Text-to-Video (ECTV) Dataset
    Total Score · uses extra data · 2023-09-27
    229
    best: 243 (VideoCrafter2)
    Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation, arXiv:2309.15818
  • Text-to-Video Generation on MSR-VTT
    FID · 2023-09-27
    13.08
    best: 8.19 (TF-T2V)
    Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation, arXiv:2309.15818
  • Text-to-Video Generation on MSR-VTT
    FVD · 2023-09-27
    538
    best: 998 (MagicVideo)
    Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation, arXiv:2309.15818