


Markov Decision Process for Video Generation

Vladyslav Yushchenko, Nikita Araslanov, Stefan Roth

2019-09-26 | Video Generation

Abstract

We identify two pathological cases of temporal inconsistencies in video generation: video freezing and video looping. To better quantify the temporal diversity, we propose a class of complementary metrics that are effective, easy to implement, data-agnostic, and interpretable. Further, we observe that current state-of-the-art models are trained on video samples of fixed length, thereby inhibiting long-term modeling. To address this, we reformulate the problem of video generation as a Markov Decision Process (MDP). The underlying idea is to represent motion as a stochastic process with an infinite forecast horizon to overcome the fixed-length limitation and to mitigate the presence of temporal artifacts. We show that our formulation is easy to integrate into the state-of-the-art MoCoGAN framework. Our experiments on the Human Actions and UCF-101 datasets demonstrate that our MDP-based model is more memory-efficient and improves the video quality in terms of both the new and the established metrics.
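To make the two failure modes named in the abstract concrete, here is a minimal sketch of how freezing and looping might be detected in a generated clip. It is an illustration only and not the metrics proposed in the paper, whose definitions are not given on this page. The function names (`frame_distance`, `freeze_ratio`, `loop_period`), the mean-absolute-difference distance, and the threshold `eps` are all assumptions made for this example; frames are represented as flat sequences of pixel values.

```python
from typing import Optional, Sequence

Frame = Sequence[float]  # hypothetical representation: one flattened frame


def frame_distance(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two equally sized frames."""
    if not a or len(a) != len(b):
        raise ValueError("frames must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def freeze_ratio(video: Sequence[Frame], eps: float = 1e-3) -> float:
    """Fraction of consecutive frame pairs that are (nearly) identical.

    1.0 means the clip never changes (fully frozen); 0.0 means every
    frame differs from its predecessor by at least `eps`.
    """
    if len(video) < 2:
        return 0.0
    frozen = sum(
        frame_distance(video[t], video[t + 1]) < eps
        for t in range(len(video) - 1)
    )
    return frozen / (len(video) - 1)


def loop_period(video: Sequence[Frame], eps: float = 1e-3) -> Optional[int]:
    """Smallest period p >= 2 such that frame t matches frame t + p for all t.

    Returns None when no such period is found within half the clip length.
    A fully frozen clip trivially satisfies every period, so check
    `freeze_ratio` first.
    """
    n = len(video)
    for p in range(2, n // 2 + 1):
        if all(frame_distance(video[t], video[t + p]) < eps for t in range(n - p)):
            return p
    return None


if __name__ == "__main__":
    # Toy clips with 4 "pixels" per frame.
    moving = [[float(t)] * 4 for t in range(8)]
    frozen = [[0.5] * 4 for _ in range(8)]
    looping = [[float(t % 3)] * 4 for t in range(9)]
    print(freeze_ratio(moving), loop_period(moving))
    print(freeze_ratio(frozen))
    print(freeze_ratio(looping), loop_period(looping))
```

Checks of this kind need only the generated frames, which fits the abstract's description of metrics that are easy to implement and data-agnostic. A practical version would likely compare frames in a feature space rather than on raw pixels.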

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video | UCF-101 16 frames, Unconditional, Single GPU | Inception Score | 11.86 | MoCoGAN-MDP |
| Video | UCF-101 16 frames, 64x64, Unconditional | FVD | 1277 | MoCoGAN-MDP |
| Video | UCF-101 16 frames, 64x64, Unconditional | Inception Score | 11.86 | MoCoGAN-MDP |
| Video Generation | UCF-101 16 frames, Unconditional, Single GPU | Inception Score | 11.86 | MoCoGAN-MDP |
| Video Generation | UCF-101 16 frames, 64x64, Unconditional | FVD | 1277 | MoCoGAN-MDP |
| Video Generation | UCF-101 16 frames, 64x64, Unconditional | Inception Score | 11.86 | MoCoGAN-MDP |

Related Papers

World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving (2025-07-17)
Leveraging Pre-Trained Visual Models for AI-Generated Video Detection (2025-07-17)
Taming Diffusion Transformer for Real-Time Mobile Video Generation (2025-07-17)
LoViC: Efficient Long Video Generation with Context Compression (2025-07-17)
$I^{2}$-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting (2025-07-12)
Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective (2025-07-11)
Scaling RL to Long Videos (2025-07-10)
Martian World Models: Controllable Video Synthesis with Physically Accurate 3D Reconstructions (2025-07-10)