Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Train Sparsely, Generate Densely: Memory-efficient Unsupervised Training of High-resolution Temporal GAN

Masaki Saito, Shunta Saito, Masanori Koyama, Sosuke Kobayashi

Published: 2018-11-22 · Task: Video Generation

Abstract

Training a Generative Adversarial Network (GAN) on a video dataset is challenging because of the sheer size of the dataset and the complexity of each observation. In general, the computational cost of training a GAN scales exponentially with the resolution. In this study, we present a novel, memory-efficient method for unsupervised learning on high-resolution video datasets whose computational cost scales only linearly with the resolution. We achieve this by designing the generator as a stack of small sub-generators and training the model in a specific way: each sub-generator is trained with its own discriminator. During training, we introduce between each pair of consecutive sub-generators an auxiliary subsampling layer that reduces the frame rate by a certain ratio. This procedure allows each sub-generator to learn the distribution of the video at a different level of resolution. Only a few GPUs are needed to train a highly complex generator that far outperforms its predecessor in terms of inception score.
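The multi-scale scheme described above can be sketched as a simple shape trace: each sub-generator doubles the spatial resolution, while the auxiliary subsampling layer between consecutive sub-generators halves the frame rate of the clip seen by that level's discriminator. This is an illustrative sketch only; the function names, the 2x ratios, and the base sizes are assumptions for demonstration, not taken from the paper's official code.

```python
# Hypothetical shape-flow sketch of a TGANv2-style stacked generator.
# Assumptions (not from the official implementation): each sub-generator
# doubles spatial resolution, and the auxiliary subsampling layer keeps
# every 2nd frame between consecutive levels.

def subsample_frames(num_frames, ratio=2):
    """Auxiliary subsampling layer: reduce the frame rate by `ratio`."""
    return max(1, num_frames // ratio)

def shape_flow(levels=3, base_frames=16, base_res=16):
    """Trace the (frames, H, W) clip shape each level's discriminator sees."""
    frames, res = base_frames, base_res
    seen = []
    for _ in range(levels):
        seen.append((frames, res, res))    # clip judged by this level's discriminator
        frames = subsample_frames(frames)  # fewer frames for the next, finer level
        res *= 2                           # next sub-generator doubles resolution
    return seen

print(shape_flow())  # [(16, 16, 16), (8, 32, 32), (4, 64, 64)]
```

The point of the trade-off is visible in the trace: as spatial resolution grows, the temporal extent shrinks, so the per-level memory cost stays roughly constant instead of growing with the full high-resolution, full-frame-rate video.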

Results

Task              | Dataset                                       | Metric          | Value | Model
Video Generation  | UCF-101, 16 frames, Unconditional, Single GPU | Inception Score | 21.45 | TGANv2
Video Generation  | UCF-101, 16 frames, 128x128, Unconditional    | Inception Score | 28.87 | TGANv2 (2020)
Video Generation  | UCF-101, 16 frames, 128x128, Unconditional    | Inception Score | 24.34 | TGANv2

Related Papers

World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving (2025-07-17)
Leveraging Pre-Trained Visual Models for AI-Generated Video Detection (2025-07-17)
Taming Diffusion Transformer for Real-Time Mobile Video Generation (2025-07-17)
LoViC: Efficient Long Video Generation with Context Compression (2025-07-17)
$I^{2}$-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting (2025-07-12)
Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective (2025-07-11)
Scaling RL to Long Videos (2025-07-10)
Martian World Models: Controllable Video Synthesis with Physically Accurate 3D Reconstructions (2025-07-10)