Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Adversarial Video Generation on Complex Datasets

Aidan Clark, Jeff Donahue, Karen Simonyan

2019-07-15 | Video Prediction | Video Generation

Abstract

Generative models of natural images have progressed towards high fidelity samples by the strong leveraging of scale. We attempt to carry this success to the field of video modeling by showing that large Generative Adversarial Networks trained on the complex Kinetics-600 dataset are able to produce video samples of substantially higher complexity and fidelity than previous work. Our proposed model, Dual Video Discriminator GAN (DVD-GAN), scales to longer and higher resolution videos by leveraging a computationally efficient decomposition of its discriminator. We evaluate on the related tasks of video synthesis and video prediction, and achieve new state-of-the-art Fréchet Inception Distance for prediction for Kinetics-600, as well as state-of-the-art Inception Score for synthesis on the UCF-101 dataset, alongside establishing a strong baseline for synthesis on Kinetics-600.
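
For readers unfamiliar with the Fréchet-style metrics named in the abstract, the following is a minimal sketch of the underlying distance, not the paper's evaluation code. It fits a Gaussian to each of two feature sets and computes the squared Fréchet distance between them. Two assumptions are made for illustration: the covariances are treated as diagonal (the standard FID uses full covariances and the term Tr(Σ_a + Σ_b − 2(Σ_a Σ_b)^{1/2})), and the features are passed in as plain lists rather than extracted by an Inception network. The function name `diagonal_frechet_distance` is hypothetical.

```python
import math
from statistics import fmean, pvariance


def diagonal_frechet_distance(feats_a, feats_b):
    """Squared Frechet distance between Gaussians fitted to two feature sets.

    Assumption: covariances are diagonal, so the trace term reduces to a
    per-dimension sum of var_a + var_b - 2 * sqrt(var_a * var_b).
    Each argument is a list of equal-length feature vectors.
    """
    dims = len(feats_a[0])
    total = 0.0
    for d in range(dims):
        col_a = [row[d] for row in feats_a]
        col_b = [row[d] for row in feats_b]
        mu_a, mu_b = fmean(col_a), fmean(col_b)
        var_a, var_b = pvariance(col_a), pvariance(col_b)
        # Squared difference of the means for this dimension.
        total += (mu_a - mu_b) ** 2
        # Diagonal form of the covariance (trace) term.
        total += var_a + var_b - 2.0 * math.sqrt(var_a * var_b)
    return total


if __name__ == "__main__":
    real = [[0.0, 1.0], [2.0, 3.0]]
    fake = [[3.0, 1.0], [5.0, 3.0]]
    print(diagonal_frechet_distance(real, fake))
```

Lower values mean the two feature distributions are closer; identical feature sets give a distance of zero.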

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video | Kinetics-600 48 frames, 64x64 | FID | 12.92 | DVD-GAN |
| Video | Kinetics-600 48 frames, 64x64 | Inception Score | 219.05 | DVD-GAN |
| Video | Kinetics-600 12 frames, 64x64 | FVD | 31.1 | DVD-GAN |
| Video | BAIR Robot Pushing | Cond | 1 | DVD-GAN-FP |
| Video | BAIR Robot Pushing | FVD score | 109.8 | DVD-GAN-FP |
| Video | BAIR Robot Pushing | Pred | 15 | DVD-GAN-FP |
| Video | BAIR Robot Pushing | Train | 15 | DVD-GAN-FP |
| Video | Kinetics-600 12 frames, 128x128 | FID | 2.16 | DVD-GAN |
| Video | Kinetics-600 12 frames, 64x64 | Cond | 5 | DVD-GAN-FP |
| Video | Kinetics-600 12 frames, 64x64 | Pred | 11 | DVD-GAN-FP |
| Video | BAIR Robot Pushing | FVD | 109.8 | DVD-GAN-FP |
| Video Prediction | Kinetics-600 12 frames, 64x64 | Cond | 5 | DVD-GAN-FP |
| Video Prediction | Kinetics-600 12 frames, 64x64 | Pred | 11 | DVD-GAN-FP |
| Video Prediction | BAIR Robot Pushing | FVD | 109.8 | DVD-GAN-FP |
| Video Generation | Kinetics-600 48 frames, 64x64 | FID | 12.92 | DVD-GAN |
| Video Generation | Kinetics-600 48 frames, 64x64 | Inception Score | 219.05 | DVD-GAN |
| Video Generation | Kinetics-600 12 frames, 64x64 | FVD | 31.1 | DVD-GAN |
| Video Generation | BAIR Robot Pushing | Cond | 1 | DVD-GAN-FP |
| Video Generation | BAIR Robot Pushing | FVD score | 109.8 | DVD-GAN-FP |
| Video Generation | BAIR Robot Pushing | Pred | 15 | DVD-GAN-FP |
| Video Generation | BAIR Robot Pushing | Train | 15 | DVD-GAN-FP |
| Video Generation | Kinetics-600 12 frames, 128x128 | FID | 2.16 | DVD-GAN |
