
Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution

Xiaoyu Xiang, Yapeng Tian, Yulun Zhang, Yun Fu, Jan P. Allebach, Chenliang Xu

Published: 2020-02-26 · CVPR 2020 · Tasks: Super-Resolution, Video Super-Resolution, Video Frame Interpolation, Space-Time Video Super-Resolution

Abstract

In this paper, we explore the space-time video super-resolution task, which aims to generate a high-resolution (HR) slow-motion video from a low frame rate (LFR), low-resolution (LR) video. A simple solution is to split it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR). However, temporal interpolation and spatial super-resolution are intra-related in this task, and two-stage methods cannot fully take advantage of this natural property. In addition, state-of-the-art VFI or VSR networks require a large frame-synthesis or reconstruction module to predict high-quality video frames, which gives two-stage methods large model sizes and makes them time-consuming. To overcome these problems, we propose a one-stage space-time video super-resolution framework that directly synthesizes an HR slow-motion video from an LFR, LR video. Rather than synthesizing missing LR video frames as VFI networks do, we first temporally interpolate LR frame features of the missing LR video frames, capturing local temporal contexts with the proposed feature temporal interpolation network. Then, we propose a deformable ConvLSTM to align and aggregate temporal information simultaneously for better leveraging global temporal contexts. Finally, a deep reconstruction network is adopted to predict HR slow-motion video frames. Extensive experiments on benchmark datasets demonstrate that the proposed method not only achieves better quantitative and qualitative performance but also is more than three times faster than recent two-stage state-of-the-art methods, e.g., DAIN+EDVR and DAIN+RBPN.
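The pipeline the abstract describes (LR feature extraction, feature-level temporal interpolation of missing frames, temporal aggregation, and HR reconstruction) can be sketched schematically as follows. This is a minimal PyTorch illustration under simplifying assumptions: the module names and channel sizes are invented, a plain ConvLSTM cell stands in for the paper's deformable feature interpolation and deformable ConvLSTM, and reconstruction is a single PixelShuffle layer. It is not the authors' implementation.

```python
# Schematic sketch of a one-stage space-time VSR pipeline in the spirit of
# Zooming Slow-Mo. Simplified stand-ins are used throughout; see the note above.
import torch
import torch.nn as nn


class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell, a stand-in for the paper's deformable ConvLSTM."""
    def __init__(self, channels):
        super().__init__()
        self.gates = nn.Conv2d(2 * channels, 4 * channels, 3, padding=1)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)


class OneStageSTVSR(nn.Module):
    def __init__(self, channels=64, scale=4):
        super().__init__()
        self.extract = nn.Conv2d(3, channels, 3, padding=1)             # LR feature extraction
        self.interp = nn.Conv2d(2 * channels, channels, 3, padding=1)   # feature temporal interpolation (simplified)
        self.cell = ConvLSTMCell(channels)                              # temporal aggregation
        self.reconstruct = nn.Sequential(                               # HR frame reconstruction
            nn.Conv2d(channels, 3 * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, lr_frames):
        # lr_frames: (B, T, 3, H, W) low-frame-rate, low-resolution input
        t = lr_frames.size(1)
        feats = [self.extract(lr_frames[:, i]) for i in range(t)]

        # 1) Temporally interpolate features of the missing intermediate frames.
        dense = []
        for i in range(t - 1):
            dense.append(feats[i])
            dense.append(self.interp(torch.cat([feats[i], feats[i + 1]], dim=1)))
        dense.append(feats[-1])

        # 2) Aggregate temporal context across the densified feature sequence.
        state = (torch.zeros_like(dense[0]), torch.zeros_like(dense[0]))
        hidden = []
        for f_t in dense:
            h_t, state = self.cell(f_t, state)
            hidden.append(h_t)

        # 3) Reconstruct an HR frame for every time step (2T - 1 frames in total).
        return torch.stack([self.reconstruct(h_t) for h_t in hidden], dim=1)


if __name__ == "__main__":
    video = torch.rand(1, 4, 3, 32, 32)   # 4 LR frames
    out = OneStageSTVSR()(video)
    print(out.shape)                      # torch.Size([1, 7, 3, 128, 128])
```

On a T-frame LR input the sketch produces 2T - 1 HR frames in a single forward pass, which is the key structural difference from chaining separate VFI and VSR networks.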

Results

Task                      | Dataset             | Metric      | Value      | Model
Video Super-Resolution    | Vid4 - 4x upscaling | PSNR        | 26.31      | Zooming Slow-Mo
Video Super-Resolution    | Vid4 - 4x upscaling | Parameters  | 11,100,000 | Zooming Slow-Mo
Video Super-Resolution    | Vid4 - 4x upscaling | SSIM        | 0.7976     | Zooming Slow-Mo
Video Super-Resolution    | Vid4 - 4x upscaling | Runtime (s) | 0.0606     | Zooming Slow-Mo
Video Frame Interpolation | Vid4 - 4x upscaling | PSNR        | 26.31      | Zooming Slow-Mo
Video Frame Interpolation | Vid4 - 4x upscaling | Parameters  | 11,100,000 | Zooming Slow-Mo
Video Frame Interpolation | Vid4 - 4x upscaling | SSIM        | 0.7976     | Zooming Slow-Mo
Video Frame Interpolation | Vid4 - 4x upscaling | Runtime (s) | 0.0606     | Zooming Slow-Mo
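For readers unfamiliar with the metrics above: PSNR (in dB) is derived from the mean squared error between reconstructed and ground-truth frames, and higher is better. The snippet below is a minimal illustration assuming float images normalized to [0, 1]; it is not the exact Vid4 evaluation protocol, which is typically computed on the luminance (Y) channel.

```python
# Minimal PSNR illustration, assuming float images in [0, 1]. For intuition only;
# the Vid4 numbers above follow the benchmark's own evaluation protocol.
import numpy as np

def psnr(pred: np.ndarray, target: np.ndarray, max_val: float = 1.0) -> float:
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

rng = np.random.default_rng(0)
gt = rng.random((64, 64, 3))
noisy = np.clip(gt + rng.normal(scale=0.05, size=gt.shape), 0.0, 1.0)
print(f"PSNR: {psnr(noisy, gt):.2f} dB")
```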

Related Papers

SpectraLift: Physics-Guided Spectral-Inversion Network for Self-Supervised Hyperspectral Image Super-Resolution (2025-07-17)
IM-LUT: Interpolation Mixing Look-Up Tables for Image Super-Resolution (2025-07-14)
PanoDiff-SR: Synthesizing Dental Panoramic Radiographs using Diffusion and Super-resolution (2025-07-12)
HNOSeg-XS: Extremely Small Hartley Neural Operator for Efficient and Resolution-Robust 3D Image Segmentation (2025-07-10)
4KAgent: Agentic Any Image to 4K Super-Resolution (2025-07-09)
TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation (2025-07-07)
EAMamba: Efficient All-Around Vision State Space Model for Image Restoration (2025-06-27)
Leveraging Vision-Language Models to Select Trustworthy Super-Resolution Samples Generated by Diffusion Models (2025-06-25)