Stochastic Adversarial Video Prediction

Alex X. Lee, Richard Zhang, Frederik Ebert, Pieter Abbeel, Chelsea Finn, Sergey Levine

2018-04-04 | ICLR 2019 5 | Representation Learning, Video Prediction, Prediction, Video Generation

Abstract

Being able to predict what may happen in the future requires an in-depth understanding of the physical and causal rules that govern the world. A model that is able to do so has a number of appealing applications, from robotic planning to representation learning. However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging: the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction. Recently, this has been addressed by two distinct approaches: (a) variational latent variable models that explicitly model underlying stochasticity and (b) adversarially trained models that aim to produce naturalistic images. However, a standard latent variable model can struggle to produce realistic results, and a standard adversarially trained model underutilizes latent variables and fails to produce diverse predictions. We show that these distinct methods are in fact complementary. Combining the two produces predictions that look more realistic to human raters and better cover the range of possible futures. Our method outperforms prior and concurrent work in these aspects.
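To make the idea of combining the two approaches concrete, here is a minimal NumPy sketch of a generator loss that adds a variational term (reconstruction plus KL divergence) to an adversarial term. It is an illustration under stated assumptions, not the authors' implementation: the function names, the L1 reconstruction choice, the non-saturating GAN loss, and the weights `l1_weight` and `kl_weight` are all hypothetical and are not given in the abstract.

```python
import numpy as np


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over the last axis."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def generator_gan_loss(disc_logits_fake):
    """Non-saturating generator loss: mean of -log(sigmoid(logit)).

    Uses the identity -log(sigmoid(x)) = log(1 + exp(-x)) = logaddexp(0, -x)
    for numerical stability.
    """
    return np.mean(np.logaddexp(0.0, -disc_logits_fake))


def combined_generator_loss(pred, target, mu, log_var, disc_logits_fake,
                            l1_weight=100.0, kl_weight=0.1):
    """Weighted sum of a VAE-style term and an adversarial term.

    The weights are placeholder values chosen for illustration only.
    """
    recon = np.mean(np.abs(pred - target))             # L1 reconstruction
    kl = np.mean(kl_to_standard_normal(mu, log_var))   # latent regularizer
    adv = generator_gan_loss(disc_logits_fake)         # realism term
    return l1_weight * recon + kl_weight * kl + adv


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((2, 4, 8, 8, 3))   # (batch, time, height, width, channels)
    pred = target + 0.05 * rng.standard_normal(target.shape)
    mu = 0.1 * rng.standard_normal((2, 8))
    log_var = np.zeros((2, 8))
    logits = rng.standard_normal((2, 1))
    print(combined_generator_loss(pred, target, mu, log_var, logits))
```

In this sketch, the reconstruction and KL terms push the latent variables to carry information about the future, while the adversarial term penalizes outputs a discriminator can tell apart from real frames. How the terms are actually weighted and which discriminators are used is a design decision not specified here.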

Results

Task	Dataset	Metric	Value	Model
Video	BAIR Robot Pushing	Cond	2	SAVP (from FVD)
Video	BAIR Robot Pushing	FVD score	116.4	SAVP (from FVD)
Video	BAIR Robot Pushing	Pred	14	SAVP (from FVD)
Video	BAIR Robot Pushing	Train	14	SAVP (from FVD)
Video	BAIR Robot Pushing	Cond	2	SAVP (from vRNN)
Video	BAIR Robot Pushing	FVD score	143.43	SAVP (from vRNN)
Video	BAIR Robot Pushing	Pred	28	SAVP (from vRNN)
Video	BAIR Robot Pushing	Train	10	SAVP (from vRNN)
Video	BAIR Robot Pushing	Cond	2	SAVP (from SRVP)
Video	BAIR Robot Pushing	Pred	28	SAVP (from SRVP)
Video	BAIR Robot Pushing	Train	12	SAVP (from SRVP)
Video	BAIR Robot Pushing	Cond	2	SAVP-VAE (from WAM)
Video	BAIR Robot Pushing	PSNR	19.09	SAVP-VAE (from WAM)
Video	BAIR Robot Pushing	Pred	28	SAVP-VAE (from WAM)
Video	BAIR Robot Pushing	SSIM	0.815	SAVP-VAE (from WAM)
Video	BAIR Robot Pushing	Train	14	SAVP-VAE (from WAM)
Video	KTH	Cond	10	SAVP-VAE (from Grid-keypoints)
Video	KTH	FVD	145.7	SAVP-VAE (from Grid-keypoints)
Video	KTH	LPIPS	0.116	SAVP-VAE (from Grid-keypoints)
Video	KTH	PSNR	26	SAVP-VAE (from Grid-keypoints)
Video	KTH	Params (M)	7.3	SAVP-VAE (from Grid-keypoints)
Video	KTH	Pred	40	SAVP-VAE (from Grid-keypoints)
Video	KTH	SSIM	0.806	SAVP-VAE (from Grid-keypoints)
Video	KTH	Train	10	SAVP-VAE (from Grid-keypoints)
Video	KTH	Cond	10	SAVP (from Grid-keypoints)
Video	KTH	FVD	183.7	SAVP (from Grid-keypoints)
Video	KTH	LPIPS	0.126	SAVP (from Grid-keypoints)
Video	KTH	PSNR	23.79	SAVP (from Grid-keypoints)
Video	KTH	Params (M)	17.6	SAVP (from Grid-keypoints)
Video	KTH	Pred	40	SAVP (from Grid-keypoints)
Video	KTH	SSIM	0.699	SAVP (from Grid-keypoints)
Video	KTH	Train	10	SAVP (from Grid-keypoints)
Video	KTH	Cond	10	SAVP (from SRVP)
Video	KTH	Pred	30	SAVP (from SRVP)
Video	KTH	Train	10	SAVP (from SRVP)
Video	KTH	Cond	10	SAVP-VAE
Video	KTH	PSNR	27.77	SAVP-VAE
Video	KTH	Pred	20	SAVP-VAE
Video	KTH	SSIM	0.852	SAVP-VAE
Video Prediction	KTH	Cond	10	SAVP-VAE (from Grid-keypoints)
Video Prediction	KTH	FVD	145.7	SAVP-VAE (from Grid-keypoints)
Video Prediction	KTH	LPIPS	0.116	SAVP-VAE (from Grid-keypoints)
Video Prediction	KTH	PSNR	26	SAVP-VAE (from Grid-keypoints)
Video Prediction	KTH	Params (M)	7.3	SAVP-VAE (from Grid-keypoints)
Video Prediction	KTH	Pred	40	SAVP-VAE (from Grid-keypoints)
Video Prediction	KTH	SSIM	0.806	SAVP-VAE (from Grid-keypoints)
Video Prediction	KTH	Train	10	SAVP-VAE (from Grid-keypoints)
Video Prediction	KTH	Cond	10	SAVP (from Grid-keypoints)
Video Prediction	KTH	FVD	183.7	SAVP (from Grid-keypoints)
Video Prediction	KTH	LPIPS	0.126	SAVP (from Grid-keypoints)
Video Prediction	KTH	PSNR	23.79	SAVP (from Grid-keypoints)
Video Prediction	KTH	Params (M)	17.6	SAVP (from Grid-keypoints)
Video Prediction	KTH	Pred	40	SAVP (from Grid-keypoints)
Video Prediction	KTH	SSIM	0.699	SAVP (from Grid-keypoints)
Video Prediction	KTH	Train	10	SAVP (from Grid-keypoints)
Video Prediction	KTH	Cond	10	SAVP (from SRVP)
Video Prediction	KTH	Pred	30	SAVP (from SRVP)
Video Prediction	KTH	Train	10	SAVP (from SRVP)
Video Prediction	KTH	Cond	10	SAVP-VAE
Video Prediction	KTH	PSNR	27.77	SAVP-VAE
Video Prediction	KTH	Pred	20	SAVP-VAE
Video Prediction	KTH	SSIM	0.852	SAVP-VAE
Video Generation	BAIR Robot Pushing	Cond	2	SAVP (from FVD)
Video Generation	BAIR Robot Pushing	FVD score	116.4	SAVP (from FVD)
Video Generation	BAIR Robot Pushing	Pred	14	SAVP (from FVD)
Video Generation	BAIR Robot Pushing	Train	14	SAVP (from FVD)
Video Generation	BAIR Robot Pushing	Cond	2	SAVP (from vRNN)
Video Generation	BAIR Robot Pushing	FVD score	143.43	SAVP (from vRNN)
Video Generation	BAIR Robot Pushing	Pred	28	SAVP (from vRNN)
Video Generation	BAIR Robot Pushing	Train	10	SAVP (from vRNN)
Video Generation	BAIR Robot Pushing	Cond	2	SAVP (from SRVP)
Video Generation	BAIR Robot Pushing	Pred	28	SAVP (from SRVP)
Video Generation	BAIR Robot Pushing	Train	12	SAVP (from SRVP)
Video Generation	BAIR Robot Pushing	Cond	2	SAVP-VAE (from WAM)
Video Generation	BAIR Robot Pushing	PSNR	19.09	SAVP-VAE (from WAM)
Video Generation	BAIR Robot Pushing	Pred	28	SAVP-VAE (from WAM)
Video Generation	BAIR Robot Pushing	SSIM	0.815	SAVP-VAE (from WAM)
Video Generation	BAIR Robot Pushing	Train	14	SAVP-VAE (from WAM)
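For readers unfamiliar with the PSNR metric listed in the table, the following is a minimal sketch of how it is commonly computed for a predicted video against its ground truth. The function names and the per-frame averaging are assumptions made for illustration; the exact evaluation protocol behind the values above is not stated on this page.

```python
import numpy as np


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0.0:
        return float("inf")  # identical inputs
    return 10.0 * np.log10(max_val ** 2 / mse)


def video_psnr(pred_frames, target_frames, max_val=1.0):
    """Average of per-frame PSNR over the leading (time) axis."""
    scores = [psnr(p, t, max_val) for p, t in zip(pred_frames, target_frames)]
    return float(np.mean(scores))


if __name__ == "__main__":
    target = np.zeros((5, 64, 64, 3))      # (time, height, width, channels)
    pred = np.full_like(target, 0.1)       # constant error of 0.1 per pixel
    print(video_psnr(pred, target))        # MSE = 0.01, so about 20 dB
```

Higher PSNR means lower pixel-wise error. Because it rewards averaged, blurry predictions, it is usually read alongside perceptual or distribution-level metrics such as LPIPS and FVD.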
