Adversarial Video Generation on Complex Datasets

Aidan Clark, Jeff Donahue, Karen Simonyan

2019-07-15 · Video Prediction · Video Generation

Abstract

Generative models of natural images have progressed towards high fidelity samples by the strong leveraging of scale. We attempt to carry this success to the field of video modeling by showing that large Generative Adversarial Networks trained on the complex Kinetics-600 dataset are able to produce video samples of substantially higher complexity and fidelity than previous work. Our proposed model, Dual Video Discriminator GAN (DVD-GAN), scales to longer and higher resolution videos by leveraging a computationally efficient decomposition of its discriminator. We evaluate on the related tasks of video synthesis and video prediction, and achieve new state-of-the-art Fréchet Inception Distance for prediction for Kinetics-600, as well as state-of-the-art Inception Score for synthesis on the UCF-101 dataset, alongside establishing a strong baseline for synthesis on Kinetics-600.

Results

Task	Dataset	Metric	Value	Model
Video Prediction	Kinetics-600 12 frames, 64x64	Cond	5	DVD-GAN-FP
Video Prediction	Kinetics-600 12 frames, 64x64	Pred	11	DVD-GAN-FP
Video Prediction	BAIR Robot Pushing	FVD	109.8	DVD-GAN-FP
Video Generation	Kinetics-600 48 frames, 64x64	FID	12.92	DVD-GAN
Video Generation	Kinetics-600 48 frames, 64x64	Inception Score	219.05	DVD-GAN
Video Generation	Kinetics-600 12 frames, 64x64	FVD	31.1	DVD-GAN
Video Generation	BAIR Robot Pushing	Cond	1	DVD-GAN-FP
Video Generation	BAIR Robot Pushing	FVD	109.8	DVD-GAN-FP
Video Generation	BAIR Robot Pushing	Pred	15	DVD-GAN-FP
Video Generation	BAIR Robot Pushing	Train	15	DVD-GAN-FP
Video Generation	Kinetics-600 12 frames, 128x128	FID	2.16	DVD-GAN
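
The FID and FVD rows above are both Fréchet distances: a Gaussian is fitted to feature embeddings of real samples and another to embeddings of generated samples, and the distance between the two Gaussians is reported (lower is better). The following is a minimal sketch of that distance only, not the evaluation code behind the table. The function name `frechet_distance` is hypothetical, the pretrained feature extractor is omitted, and random arrays stand in for real features.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Fréchet distance between Gaussians fitted to two feature sets.

    Each argument is an array of shape (num_samples, feature_dim).
    Hypothetical helper for illustration; feature extraction is not shown.
    """
    mu_r = feats_real.mean(axis=0)
    mu_f = feats_fake.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_f = np.atleast_2d(np.cov(feats_fake, rowvar=False))

    # Tr((cov_r cov_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_f, which are real and non-negative for
    # positive semi-definite inputs (clipping guards against round-off).
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    mean_term = np.sum((mu_r - mu_f) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_f) - 2.0 * trace_sqrt)


# Stand-in features: 200 samples of dimension 4 (assumed values, not real data).
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 4))
shifted = real + np.array([1.0, 2.0, 0.0, 0.0])

same_score = frechet_distance(real, real)
shift_score = frechet_distance(real, shifted)
```

Shifting every feature vector by a constant leaves the covariance unchanged, so `shift_score` reduces to the squared norm of the shift, while `same_score` is zero up to floating-point error.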
