Deep Video Generation, Prediction and Completion of Human Action Sequences

Haoye Cai, Chunyan Bai, Yu-Wing Tai, Chi-Keung Tang

Date: 2017-11-23
Venue: ECCV 2018 9
Tasks: Video Prediction, Prediction, Human action generation, Video Generation

Abstract

Current deep learning results on video generation are limited, while there are only a few first results on video prediction and no relevant significant results on video completion. This is due to the severe ill-posedness inherent in these three problems. In this paper, we focus on human action videos and propose a general, two-stage deep framework to generate human action videos with no constraints or with an arbitrary number of constraints. The framework uniformly addresses the three problems: video generation given no input frames, video prediction given the first few frames, and video completion given the first and last frames. To make the problem tractable, in the first stage we train a deep generative model that generates a human pose sequence from random noise. In the second stage, a skeleton-to-image network is trained, which is used to generate a human action video given the complete human pose sequence generated in the first stage. By introducing the two-stage strategy, we sidestep the original ill-posed problems while producing, for the first time, high-quality video generation/prediction/completion results of much longer duration. We present quantitative and qualitative evaluations to show that our two-stage approach outperforms state-of-the-art methods in video generation, prediction and completion. Our video result demonstration can be viewed at https://iamacewhite.github.io/supp/index.html
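
The abstract frames generation, prediction and completion as one problem that differs only in which frames are given as constraints. The following is a minimal sketch of that framing, not the authors' implementation: the function name `constraint_mask`, its parameters and the task strings are hypothetical, and the abstract does not describe how the networks actually consume such constraints.

```python
from typing import List


def constraint_mask(task: str, num_frames: int, num_given: int = 0) -> List[bool]:
    """Return a per-frame mask: True where a frame is given as a constraint.

    Hypothetical helper for illustration only; names are not from the paper.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")

    mask = [False] * num_frames

    if task == "generation":
        # No input frames: every frame has to be synthesized.
        return mask

    if task == "prediction":
        # The first num_given frames are observed; the rest are predicted.
        if not 0 < num_given < num_frames:
            raise ValueError("num_given must be between 1 and num_frames - 1")
        for i in range(num_given):
            mask[i] = True
        return mask

    if task == "completion":
        # Only the first and last frames are observed; the middle is filled in.
        if num_frames < 2:
            raise ValueError("completion needs at least two frames")
        mask[0] = True
        mask[-1] = True
        return mask

    raise ValueError(f"unknown task: {task}")
```

For example, `constraint_mask("prediction", 5, 2)` marks the first two of five frames as given, and `constraint_mask("completion", 4)` marks only the first and last frames.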

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Activity Recognition | NTU RGB+D 2D | MMDa (CS) | 0.698 | SkeletonGAN |
| Activity Recognition | NTU RGB+D 2D | MMDa (CV) | 0.999 | SkeletonGAN |
| Activity Recognition | NTU RGB+D 2D | MMDs (CS) | 0.788 | SkeletonGAN |
| Activity Recognition | NTU RGB+D 2D | MMDs (CV) | 1.311 | SkeletonGAN |
| Activity Recognition | Human3.6M | MMDa | 0.419 | Deep Video Generation, Prediction and Completion of Human Action Sequences |
| Activity Recognition | Human3.6M | MMDs | 0.436 | Deep Video Generation, Prediction and Completion of Human Action Sequences |
| Human action generation | NTU RGB+D 2D | MMDa (CS) | 0.698 | SkeletonGAN |
| Human action generation | NTU RGB+D 2D | MMDa (CV) | 0.999 | SkeletonGAN |
| Human action generation | NTU RGB+D 2D | MMDs (CS) | 0.788 | SkeletonGAN |
| Human action generation | NTU RGB+D 2D | MMDs (CV) | 1.311 | SkeletonGAN |
| Human action generation | Human3.6M | MMDa | 0.419 | Deep Video Generation, Prediction and Completion of Human Action Sequences |
| Human action generation | Human3.6M | MMDs | 0.436 | Deep Video Generation, Prediction and Completion of Human Action Sequences |
