PredRNN++: Towards A Resolution of the Deep-in-Time Dilemma in Spatiotemporal Predictive Learning

Yunbo Wang, Zhifeng Gao, Mingsheng Long, Jian-Min Wang, Philip S. Yu

2018-04-17 | ICML 2018 | Video Prediction


Abstract

We present PredRNN++, an improved recurrent network for video predictive learning. In pursuit of a greater spatiotemporal modeling capability, our approach increases the transition depth between adjacent states by leveraging a novel recurrent unit, named Causal LSTM, that reorganizes the spatial and temporal memories in a cascaded mechanism. However, there is still a dilemma in video predictive learning: increasingly deep-in-time models have been designed to capture complex variations, but they make gradient back-propagation more difficult. To alleviate this undesirable effect, we propose a Gradient Highway architecture, which provides alternative shorter routes for gradient flows from outputs back to long-range inputs. This architecture works seamlessly with Causal LSTMs, enabling PredRNN++ to capture short-term and long-term dependencies adaptively. We assess our model on both synthetic and real video datasets, showing its ability to ease the vanishing gradient problem and yield state-of-the-art prediction results even in a difficult object occlusion scenario.
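The gradient highway idea in the abstract can be illustrated with a small sketch. It assumes the commonly cited form of the highway unit: a candidate state P_t = tanh(W_px X_t + W_pz Z_{t-1}), a switch gate S_t = sigmoid(W_sx X_t + W_sz Z_{t-1}), and an update Z_t = S_t * P_t + (1 - S_t) * Z_{t-1}. The dense matrix products stand in for the convolutions a video model would use, and the names `ghu_step` and `init_params` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def init_params(dim, seed=0):
    """Small random weights for the four transforms (illustrative only)."""
    rng = np.random.default_rng(seed)
    names = ("W_px", "W_pz", "W_sx", "W_sz")
    return {name: rng.normal(scale=0.1, size=(dim, dim)) for name in names}


def ghu_step(x_t, z_prev, params):
    """One highway step: blend a new candidate with the previous state.

    Assumption: dense layers replace the convolutions of a real video model.
    """
    # Candidate state computed from the current input and previous highway state.
    p_t = np.tanh(x_t @ params["W_px"] + z_prev @ params["W_pz"])
    # Switch gate in (0, 1): how much of the candidate to let through.
    s_t = sigmoid(x_t @ params["W_sx"] + z_prev @ params["W_sz"])
    # The (1 - s_t) * z_prev term carries the old state forward unchanged,
    # which is the shorter route gradients can follow back through time.
    return s_t * p_t + (1.0 - s_t) * z_prev


if __name__ == "__main__":
    params = init_params(dim=3)
    z = np.zeros((2, 3))
    for _ in range(5):  # unroll a few time steps on a constant input
        z = ghu_step(np.ones((2, 3)), z, params)
    print(z.shape)
```

When the switch gate is close to zero, the previous state passes through almost untouched, so the derivative of Z_t with respect to Z_{t-1} stays near the identity. That is the sense in which the unit offers a short path for gradients.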

Results

Task	Dataset	Metric	Value	Model
Video Prediction	Moving MNIST	MAE	106.8	Causal LSTM
Video Prediction	Moving MNIST	MSE	46.5	Causal LSTM
Video Prediction	Moving MNIST	SSIM	0.898	Causal LSTM
Video Prediction	KTH	Cond (input frames)	10	PredRNN++
Video Prediction	KTH	PSNR	28.47	PredRNN++
Video Prediction	KTH	Pred (predicted frames)	20	PredRNN++
Video Prediction	KTH	SSIM	0.865	PredRNN++
Video Prediction	SynpickVP	LPIPS	0.053	PredRNN++
Video Prediction	SynpickVP	MSE	51.73	PredRNN++
Video Prediction	SynpickVP	PSNR	27.5	PredRNN++
Video Prediction	SynpickVP	SSIM	0.894	PredRNN++
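For readers unfamiliar with the error metrics in the table, the following is a minimal sketch of MSE, MAE, and PSNR between a predicted frame and its ground truth. It uses the standard textbook definitions averaged over pixels. The normalization behind the reported values is not stated here, and benchmarks differ on whether errors are averaged per pixel or summed per frame, so these functions are not guaranteed to reproduce the numbers above. The function names and the `max_val` default are assumptions made for the illustration.

```python
import numpy as np


def frame_mse(pred, target):
    """Mean squared error averaged over all pixels."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean((pred - target) ** 2))


def frame_mae(pred, target):
    """Mean absolute error averaged over all pixels."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.mean(np.abs(pred - target)))


def psnr(pred, target, max_val=255.0):
    """Peak signal-to-noise ratio in dB; max_val is the largest pixel value."""
    mse = frame_mse(pred, target)
    if mse == 0.0:
        return float("inf")  # identical frames
    return float(10.0 * np.log10(max_val ** 2 / mse))


if __name__ == "__main__":
    target = np.zeros((2, 2))
    pred = np.array([[0.0, 10.0], [0.0, 0.0]])
    print(frame_mse(pred, target), frame_mae(pred, target), psnr(pred, target))
```

Lower MSE and MAE and higher PSNR indicate predictions closer to the ground truth. SSIM and LPIPS are omitted because they need a windowed structural comparison and a pretrained network, respectively.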
