Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy
Recurrent structures are a popular framework choice for video super-resolution. The state-of-the-art method BasicVSR adopts bidirectional propagation with feature alignment to effectively exploit information from the entire input video. In this study, we redesign BasicVSR by proposing second-order grid propagation and flow-guided deformable alignment. We show that by empowering the recurrent framework with enhanced propagation and alignment, one can exploit spatiotemporal information across misaligned video frames more effectively. The new components lead to improved performance under a similar computational constraint. In particular, our model BasicVSR++ surpasses BasicVSR by 0.82 dB in PSNR with a similar number of parameters. In addition to video super-resolution, BasicVSR++ generalizes well to other video restoration tasks such as compressed video enhancement. In NTIRE 2021, BasicVSR++ won three first places and one runner-up place in the Video Super-Resolution and Compressed Video Enhancement Challenges. Code and models will be released in MMEditing.
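
To make the idea of second-order grid propagation more concrete, the following is a toy sketch only, not the BasicVSR++ implementation. It assumes scalar "features" per frame, hypothetical mixing weights (`w_in`, `w_prev1`, `w_prev2`), no alignment step, and an assumed alternation of backward and forward branches. The function names are illustrative. The point it shows is that each state draws on the two previously visited frames rather than one, and that several propagation branches are stacked in alternating directions.

```python
from typing import List


def second_order_pass(
    inputs: List[float],
    reverse: bool = False,
    w_in: float = 0.5,      # hypothetical weight of the current frame
    w_prev1: float = 0.3,   # hypothetical weight of the last visited frame
    w_prev2: float = 0.2,   # hypothetical weight of the frame before that
) -> List[float]:
    """One propagation branch with a second-order connection.

    Each state mixes the current input with the states of the two
    previously visited frames. A real model would align and fuse
    feature maps; here everything is a scalar for illustration.
    """
    n = len(inputs)
    order = range(n - 1, -1, -1) if reverse else range(n)
    states = [0.0] * n
    prev1 = 0.0  # state of the last visited frame
    prev2 = 0.0  # state of the second-to-last visited frame
    for i in order:
        states[i] = w_in * inputs[i] + w_prev1 * prev1 + w_prev2 * prev2
        prev2, prev1 = prev1, states[i]
    return states


def grid_propagation(frames: List[float], num_branches: int = 4) -> List[float]:
    """Stack several branches, alternating backward and forward passes.

    Simplification: each branch only sees the output of the previous
    branch. The branch count and direction order are assumptions.
    """
    feats = list(frames)
    for branch in range(num_branches):
        feats = second_order_pass(feats, reverse=(branch % 2 == 0))
    return feats


if __name__ == "__main__":
    impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
    print("second-order:", second_order_pass(impulse))
    print("first-order: ", second_order_pass(impulse, w_prev2=0.0))
    print("grid:        ", grid_propagation(impulse))
```

Setting `w_prev2=0.0` reduces the pass to an ordinary first-order recurrence, which makes it easy to compare how far information from a single frame reaches in each case.
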
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video Enhancement | MFQE v2 | Incremental PSNR | 1.1 | BasicVSR++ |
| Video Restoration | TAPE | LPIPS | 0.098 | BasicVSR++ |
| Video Restoration | TAPE | PSNR | 31.66 | BasicVSR++ |
| Video Restoration | TAPE | SSIM | 0.916 | BasicVSR++ |
| Video Restoration | TAPE | VMAF | 78.91 | BasicVSR++ |
| Video Super-Resolution | MSU Video Upscalers: Quality Enhancement | LPIPS | 0.334 | BasicVsr++RD |
| Video Super-Resolution | MSU Video Upscalers: Quality Enhancement | PSNR | 30.98 | BasicVsr++RD |
| Video Super-Resolution | MSU Video Upscalers: Quality Enhancement | SSIM | 0.881 | BasicVsr++RD |
| Video Super-Resolution | Vid4 - 4x upscaling | PSNR | 27.79 | BasicVSR++ |
| Video Super-Resolution | Vid4 - 4x upscaling | SSIM | 0.84 | BasicVSR++ |
| Video Super-Resolution | Vid4 - 4x upscaling - BD degradation | PSNR | 29.04 | BasicVSR++ |
| Video Super-Resolution | Vid4 - 4x upscaling - BD degradation | SSIM | 0.8753 | BasicVSR++ |
| Video Super-Resolution | UDM10 - 4x upscaling | PSNR | 40.72 | BasicVSR++ |
| Video Super-Resolution | UDM10 - 4x upscaling | SSIM | 0.9722 | BasicVSR++ |
| Video Deraining | VRDS | PSNR | 29.75 | BasicVSR++ |
| Video Deraining | VRDS | SSIM | 0.9171 | BasicVSR++ |
| Video Deraining | Video Waterdrop Removal Dataset | PSNR | 32.37 | BasicVSR++ |
| Video Deraining | Video Waterdrop Removal Dataset | SSIM | 0.9792 | BasicVSR++ |
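
The PSNR figures in the table, and the 0.82 dB gain quoted in the abstract, are easier to interpret through the standard definition PSNR = 10 · log10(MAX² / MSE). The sketch below uses only that definition; the function names are illustrative, and it assumes pixel values normalised to [0, 1]. Benchmark PSNR is usually averaged over frames and sequences, so converting a reported gain into a mean-squared-error ratio is indicative only.

```python
import math


def psnr(mse: float, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse <= 0:
        raise ValueError("mse must be positive")
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db: float) -> float:
    """Factor by which MSE changes when PSNR rises by `gain_db`."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    ratio = mse_ratio_for_gain(0.82)
    print(f"A 0.82 dB gain scales the MSE by about {ratio:.3f}")

    # Consistency check on an arbitrary example error level.
    base_mse = 1e-3
    gain = psnr(base_mse * ratio) - psnr(base_mse)
    print(f"Recovered gain: {gain:.2f} dB")
```

Under these assumptions, a 0.82 dB improvement corresponds to a reduction in mean squared error of roughly 17%.
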