Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Video on UCF-101

Metric: Inception Score (higher is better)
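
For readers unfamiliar with the metric, the Inception Score is commonly defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's predicted class distribution for a generated sample and p(y) is the average of those predictions over all samples. The sketch below is a minimal illustration of that formula only. The function name `inception_score` and the plain list-of-lists input are assumptions made for this example; this page does not say which classifier, resolution, or sample count each paper used, so the sketch cannot reproduce the scores in the leaderboard.

```python
import math


def inception_score(probs):
    """Compute exp(mean KL(p(y|x) || p(y))) from per-sample class probabilities.

    probs: list of rows, one per generated sample; each row is that sample's
    predicted class distribution p(y|x) and is assumed to sum to 1.
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y): the average prediction over all samples.
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    # Mean KL divergence between each conditional p(y|x) and the marginal p(y).
    # Terms with p(y|x) = 0 contribute nothing, so they are skipped.
    mean_kl = 0.0
    for row in probs:
        mean_kl += sum(
            p * math.log(p / m) for p, m in zip(row, marginal) if p > 0
        )
    mean_kl /= n
    return math.exp(mean_kl)


if __name__ == "__main__":
    # Identical, uninformative predictions give the minimum score.
    print(inception_score([[0.5, 0.5], [0.5, 0.5]]))
    # Confident predictions spread evenly over both classes give the maximum.
    print(inception_score([[1.0, 0.0], [0.0, 1.0]]))
```

The score ranges from 1 (every sample gets the same prediction) up to the number of classes (confident predictions spread evenly across all classes), which is why higher is better.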

Results

| # | Model | Inception Score | Extra Data | Paper | Date | Code |
|---|-------|-----------------|------------|-------|------|------|
| 1 | HPDM-L | 87.68 | No | Hierarchical Patch Diffusion Models for High-Res... | 2024-06-12 | - |
| 2 | Make-A-Video (Finetuning, 256x256, class-conditional) | 82.55 | No | Make-A-Video: Text-to-Video Generation without T... | 2022-09-29 | Code |
| 3 | VideoFusion (128x128, class-conditional) | 80.03 | No | VideoFusion: Decomposed Diffusion Models for Hig... | 2023-03-15 | Code |
| 4 | TATS (128x128, class-conditional) | 79.28 | No | Long Video Generation with Time-Agnostic VQGAN a... | 2022-04-07 | Code |
| 5 | FIFO-Diffusion | 74.44 | No | FIFO-Diffusion: Generating Infinite Videos from ... | 2024-05-19 | Code |
| 6 | MMVG (128x128, class-conditional) | 73.7 | No | Tell Me What Happened: Unifying Text-guided Vide... | 2022-11-23 | Code |
| 7 | VideoFusion (128x128, unconditional) | 72.22 | No | VideoFusion: Decomposed Diffusion Models for Hig... | 2023-03-15 | Code |
| 8 | MeBT (128x128, unconditional) | 65.93 | No | Towards End-to-End Generative Modeling of Long V... | 2023-03-20 | Code |
| 9 | GridDiff (Zero-shot) | 62.88 | No | Grid Diffusion Models for Text-to-Video Generation | 2024-03-30 | - |
| 10 | PYoCo (Zero-shot, 64x64, unconditional) | 60.01 | No | Preserve Your Own Correlation: A Noise Prior for... | 2023-05-17 | - |
| 11 | DIGAN (128x128, class-conditional) | 59.68 | No | Generating Videos with Dynamics-aware Implicit G... | 2022-02-21 | Code |
| 12 | MMVG (128x128, unconditional) | 58.3 | No | Tell Me What Happened: Unifying Text-guided Vide... | 2022-11-23 | Code |
| 13 | TATS (128x128, unconditional) | 57.63 | No | Long Video Generation with Time-Agnostic VQGAN a... | 2022-04-07 | Code |
| 14 | CogVideo (128x128, class-conditional) | 51.11 | No | CogVideo: Large-scale Pretraining for Text-to-Vi... | 2022-05-29 | Code |
| 15 | VideoAssembler (Zero-shot, 256x256, class-conditional) | 48.01 | No | MagDiff: Multi-Alignment Diffusion for High-Fide... | 2023-11-29 | Code |
| 16 | PYoCo (Zero-shot, 64x64, text-conditional) | 47.76 | No | Preserve Your Own Correlation: A Noise Prior for... | 2023-05-17 | - |
| 17 | Video-LaVIT | 44.26 | No | Video-LaVIT: Unified Video-Language Pre-training... | 2024-02-05 | Code |
| 18 | PixelDance (256x256, text-conditional) | 42.1 | No | Make Pixels Dance: High-Dynamic Video Generation | 2023-11-18 | - |
| 19 | VideoPoet (text-conditional) | 38.44 | No | VideoPoet: A Large Language Model for Zero-Shot ... | 2023-12-21 | - |
| 20 | Lumiere (Zero-shot. 1024x1024, text-conditional) | 37.54 | No | Lumiere: A Space-Time Diffusion Model for Video ... | 2024-01-23 | Code |
| 21 | W.A.L.T 3B (text-conditional) | 35.1 | No | Photorealistic Video Generation with Diffusion M... | 2023-12-11 | - |
| 22 | MoCoGAN-HD (256x256, unconditional) | 33.95 | No | A Good Image Generator Is What You Need for High... | 2021-04-30 | Code |
| 23 | Video LDM (320x512, text-conditional) | 33.45 | No | Align your Latents: High-Resolution Video Synthe... | 2023-04-18 | Code |
| 24 | Make-A-Video (Zero-shot, 256x256, class-conditional) | 33 | No | Make-A-Video: Text-to-Video Generation without T... | 2022-09-29 | Code |
| 25 | DIGAN (128x128, unconditional) | 32.7 | No | Generating Videos with Dynamics-aware Implicit G... | 2022-02-21 | Code |