Real-Time Video Super-Resolution with Spatio-Temporal Networks and Motion Compensation

Jose Caballero, Christian Ledig, Andrew Aitken, Alejandro Acosta, Johannes Totz, Zehan Wang, Wenzhe Shi

2016-11-16 | CVPR 2017 | Motion Compensation, Video Super-Resolution

Abstract

Convolutional neural networks have enabled accurate image super-resolution in real time. However, recent attempts to benefit from temporal correlations in video super-resolution have been limited to naive or inefficient architectures. In this paper, we introduce spatio-temporal sub-pixel convolution networks that effectively exploit temporal redundancies and improve reconstruction accuracy while maintaining real-time speed. Specifically, we discuss the use of early fusion, slow fusion and 3D convolutions for the joint processing of multiple consecutive video frames. We also propose a novel joint motion compensation and video super-resolution algorithm that is orders of magnitude more efficient than competing methods, relying on a fast multi-resolution spatial transformer module that is end-to-end trainable. These contributions provide both higher accuracy and temporally more consistent videos, which we confirm qualitatively and quantitatively. Relative to single-frame models, spatio-temporal networks can either reduce the computational cost by 30% whilst maintaining the same quality or provide a 0.2 dB gain for a similar computational cost. Results on publicly available datasets demonstrate that the proposed algorithms surpass current state-of-the-art performance in both accuracy and efficiency.
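
The sub-pixel convolution named in the abstract ends with a rearrangement step that turns r*r low-resolution feature maps into one image that is r times larger in each dimension. The sketch below illustrates only that rearrangement, not the authors' network. It assumes feature maps stored as nested Python lists and the channel ordering c = r * row_offset + col_offset; the function name `pixel_shuffle` and the toy input are illustrative choices, not taken from the paper.

```python
def pixel_shuffle(channels, r):
    """Rearrange r*r low-resolution channels (each H x W) into one (H*r) x (W*r) image."""
    if len(channels) != r * r:
        raise ValueError("expected r*r input channels")
    h, w = len(channels[0]), len(channels[0][0])
    out = [[0.0] * (w * r) for _ in range(h * r)]
    for y in range(h * r):
        for x in range(w * r):
            # The offset inside each r x r output block selects the source channel
            # (assumed ordering: c = r * row_offset + col_offset).
            c = (y % r) * r + (x % r)
            out[y][x] = channels[c][y // r][x // r]
    return out


# Four 1x2 feature maps (r = 2) become one 2x4 image.
features = [
    [[1, 2]],  # offset (0, 0)
    [[3, 4]],  # offset (0, 1)
    [[5, 6]],  # offset (1, 0)
    [[7, 8]],  # offset (1, 1)
]
upscaled = pixel_shuffle(features, 2)
```

Because every convolution before this step runs at low resolution, the upscaling itself costs only a memory rearrangement, which is one reason this family of models can reach real-time speed.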

Results

Task	Dataset	Metric	Value	Model
Super-Resolution	MSU Video Upscalers: Quality Enhancement	PSNR	26.92	VESPCN
Super-Resolution	MSU Video Upscalers: Quality Enhancement	SSIM	0.932	VESPCN
Super-Resolution	MSU Video Upscalers: Quality Enhancement	VMAF	53.96	VESPCN
Super-Resolution	Vid4 - 4x upscaling	MOVIE	5.82	VESPCN
Super-Resolution	Vid4 - 4x upscaling	PSNR	25.35	VESPCN
Super-Resolution	Vid4 - 4x upscaling	SSIM	0.7557	VESPCN
Super-Resolution	Vid4 - 4x upscaling	MOVIE	9.31	bicubic
Super-Resolution	Vid4 - 4x upscaling	PSNR	23.82	bicubic
Super-Resolution	Vid4 - 4x upscaling	SSIM	0.6548	bicubic
Video Super-Resolution	MSU Video Upscalers: Quality Enhancement	PSNR	26.92	VESPCN
Video Super-Resolution	MSU Video Upscalers: Quality Enhancement	SSIM	0.932	VESPCN
Video Super-Resolution	MSU Video Upscalers: Quality Enhancement	VMAF	53.96	VESPCN
Video Super-Resolution	Vid4 - 4x upscaling	MOVIE	5.82	VESPCN
Video Super-Resolution	Vid4 - 4x upscaling	PSNR	25.35	VESPCN
Video Super-Resolution	Vid4 - 4x upscaling	SSIM	0.7557	VESPCN
Video Super-Resolution	Vid4 - 4x upscaling	MOVIE	9.31	bicubic
Video Super-Resolution	Vid4 - 4x upscaling	PSNR	23.82	bicubic
Video Super-Resolution	Vid4 - 4x upscaling	SSIM	0.6548	bicubic
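
The PSNR values in the table are in decibels and follow the usual definition 10 * log10(peak^2 / MSE). A minimal sketch of that formula is given below, assuming 8-bit grayscale images stored as nested lists (peak value 255). The evaluation protocol behind the table values (colour space, border cropping, frame selection) is not stated on this page, so the sketch is not expected to reproduce those numbers; the function name `psnr` and the toy images are illustrative.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_errors = [
        (a - b) ** 2
        for row_ref, row_est in zip(reference, estimate)
        for a, b in zip(row_ref, row_est)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy 2x2 images that differ by 2 grey levels in two pixels (MSE = 2).
reference = [[52, 55], [61, 59]]
estimate = [[54, 55], [61, 57]]
score = psnr(reference, estimate)

# PSNR gap between VESPCN and bicubic on Vid4, taken from the table rows above.
gain = round(25.35 - 23.82, 2)  # 1.53 dB
```

Since PSNR is logarithmic in the mean squared error, the 1.53 dB gap on Vid4 corresponds to a mean squared error that is roughly 30% lower than that of bicubic interpolation.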
