Improved Conditional VRNNs for Video Prediction

Lluis Castrejon, Nicolas Ballas, Aaron Courville

2019-04-27 | ICCV 2019 10 | Video Prediction, Prediction, Video Generation

Abstract

Predicting future frames for a video sequence is a challenging generative modeling task. Promising approaches include probabilistic latent variable models such as the Variational Auto-Encoder (VAE). While VAEs can handle uncertainty and model multiple possible future outcomes, they have a tendency to produce blurry predictions. In this work, we argue that this is a sign of underfitting. To address this issue, we propose to increase the expressiveness of the latent distributions and to use higher capacity likelihood models. Our approach relies on a hierarchy of latent variables, which defines a family of flexible prior and posterior distributions in order to better model the probability of future sequences. We validate our proposal through a series of ablation experiments and compare our approach to current state-of-the-art latent variable models. Our method performs favorably under several metrics on three different datasets.
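To make the role of the latent hierarchy concrete, here is a minimal sketch of how a per-time-step training objective could be assembled when the prior and posterior are diagonal Gaussians at every level. This is not the authors' implementation: the function names, the `beta` weight, and the assumption that each level's Gaussian parameters have already been computed (conditioned on samples from the levels above) are all illustrative choices.

```python
import math


def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        # 0.5 * (log(var_p / var_q) + (var_q + (mu_q - mu_p)^2) / var_p - 1)
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def hierarchical_kl(posteriors, priors):
    """Sum the KL terms over all latent levels for one time step.

    posteriors, priors: lists with one (mu, logvar) pair per level.
    Assumption: the parameters of each level were already computed
    given samples from the levels above it.
    """
    return sum(
        kl_diag_gaussians(mu_q, lv_q, mu_p, lv_p)
        for (mu_q, lv_q), (mu_p, lv_p) in zip(posteriors, priors)
    )


def negative_elbo(recon_nll, posteriors, priors, beta=1.0):
    """Reconstruction negative log-likelihood plus the weighted KL.

    beta is a hypothetical weighting knob; beta=1.0 gives the plain ELBO.
    """
    return recon_nll + beta * hierarchical_kl(posteriors, priors)


# Two latent levels: the first differs in mean by 1, the second matches its prior.
posteriors = [([1.0], [0.0]), ([0.0, 0.0], [0.0, 0.0])]
priors = [([0.0], [0.0]), ([0.0, 0.0], [0.0, 0.0])]
loss = negative_elbo(2.0, posteriors, priors)
```

With a single level this reduces to the usual VAE objective. Adding levels adds KL terms, each of which lets the prior and posterior at that level depend on the levels above it.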

Results

Task	Dataset	Model	Cond	Pred	Train	FVD
Video	BAIR Robot Pushing	Hier-VRNN	2	28	10	143.4
Video	BAIR Robot Pushing	VRNN 1L	2	28	10	149.22
Video	Cityscapes 128x128	Hier-VRNN	2	28	10	567.51
Video Prediction	Cityscapes 128x128	Hier-VRNN	2	28	10	567.51
Video Generation	BAIR Robot Pushing	Hier-VRNN	2	28	10	143.4
Video Generation	BAIR Robot Pushing	VRNN 1L	2	28	10	149.22
