Jiwon Kim, Jung Kwon Lee, Kyoung Mu Lee
We present a highly accurate single-image super-resolution (SR) method. Our method uses a very deep convolutional network inspired by the VGG-net used for ImageNet classification \cite{simonyan2015very}. We find that increasing network depth significantly improves accuracy; our final model uses 20 weight layers. By cascading small filters many times in a deep network structure, contextual information over large image regions is exploited efficiently. With very deep networks, however, convergence speed becomes a critical issue during training. We propose a simple yet effective training procedure: we learn residuals only and use extremely high learning rates ($10^4$ times higher than SRCNN \cite{dong2015image}), enabled by adjustable gradient clipping. Our proposed method outperforms existing methods in accuracy, and the visual improvements in our results are easily noticeable.
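The three ideas in the abstract (cascaded small filters, residual learning, and adjustable gradient clipping) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes 3x3 stride-1 filters (the abstract only says "small"), and it reads "adjustable" clipping as bounding each gradient to `[-theta/lr, theta/lr]`, so that the effective update stays within `[-theta, theta]` at any learning rate. All function and parameter names are hypothetical.

```python
def receptive_field(num_layers, kernel_size=3):
    """Side length (in pixels) of the receptive field of stacked
    stride-1 convolutions. Assumes 3x3 filters by default."""
    return 1 + num_layers * (kernel_size - 1)


def clip_gradient(grads, theta, lr):
    """Clip each gradient to [-theta/lr, theta/lr].

    The bound widens as the learning rate shrinks, so the step
    lr * grad is always limited to [-theta, theta].
    """
    bound = theta / lr
    return [max(-bound, min(bound, g)) for g in grads]


def sgd_step(weights, grads, lr, theta):
    """One plain SGD update using the clipped gradients."""
    clipped = clip_gradient(grads, theta, lr)
    return [w - lr * g for w, g in zip(weights, clipped)]


def reconstruct(interpolated, residual):
    """Residual learning: the network predicts only the missing detail,
    which is added back to the interpolated low-resolution input."""
    return [x + r for x, r in zip(interpolated, residual)]


if __name__ == "__main__":
    print(receptive_field(20))                    # 20 layers of 3x3 filters
    print(clip_gradient([5.0, -5.0, 0.1], theta=0.5, lr=0.25))
    print(sgd_step([1.0], [5.0], lr=0.25, theta=0.5))
    print(reconstruct([0.5, 0.25], [0.25, -0.25]))
```

Under these assumptions, 20 layers of 3x3 filters see a 41x41 input region, which is how a stack of small filters covers a large context. Tying the clipping bound to the learning rate is what lets a large rate be used without the update size exploding.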
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Super-Resolution | WebFace - 8x upscaling | PSNR | 23.65 | VDSR |
| Super-Resolution | Set14 - 2x upscaling | PSNR | 33.03 | VDSR [[Kim et al.2016a]] |
| Super-Resolution | IXI | PSNR 2x T2w | 38.65 | VDSR |
| Super-Resolution | IXI | PSNR 4x T2w | 30.79 | VDSR |
| Super-Resolution | IXI | SSIM 4x T2w | 0.924 | VDSR |
| Super-Resolution | IXI | SSIM 2x T2w | 0.9836 | VDSR |
| Super-Resolution | VggFace2 - 8x upscaling | PSNR | 22.5 | VDSR |
| Super-Resolution | Manga109 - 4x upscaling | PSNR | 28.83 | VDSR |
| Super-Resolution | Manga109 - 4x upscaling | SSIM | 0.887 | VDSR |
| Super-Resolution | Urban100 - 2x upscaling | PSNR | 30.76 | VDSR [[Kim et al.2016a]] |
| Super-Resolution | Set5 - 2x upscaling | PSNR | 37.53 | VDSR [[Kim et al.2016a]] |
| Super-Resolution | MSU Video Upscalers: Quality Enhancement | PSNR | 25.89 | VDSR |
| Super-Resolution | MSU Video Upscalers: Quality Enhancement | SSIM | 0.917 | VDSR |
| Super-Resolution | MSU Video Upscalers: Quality Enhancement | VMAF | 36.46 | VDSR |
| Image Super-Resolution | WebFace - 8x upscaling | PSNR | 23.65 | VDSR |
| Image Super-Resolution | Set14 - 2x upscaling | PSNR | 33.03 | VDSR [[Kim et al.2016a]] |
| Image Super-Resolution | IXI | PSNR 2x T2w | 38.65 | VDSR |
| Image Super-Resolution | IXI | PSNR 4x T2w | 30.79 | VDSR |
| Image Super-Resolution | IXI | SSIM 4x T2w | 0.924 | VDSR |
| Image Super-Resolution | IXI | SSIM 2x T2w | 0.9836 | VDSR |
| Image Super-Resolution | VggFace2 - 8x upscaling | PSNR | 22.5 | VDSR |
| Image Super-Resolution | Manga109 - 4x upscaling | PSNR | 28.83 | VDSR |
| Image Super-Resolution | Manga109 - 4x upscaling | SSIM | 0.887 | VDSR |
| Image Super-Resolution | Urban100 - 2x upscaling | PSNR | 30.76 | VDSR [[Kim et al.2016a]] |
| Image Super-Resolution | Set5 - 2x upscaling | PSNR | 37.53 | VDSR [[Kim et al.2016a]] |
| Video Super-Resolution | MSU Video Upscalers: Quality Enhancement | PSNR | 25.89 | VDSR |
| Video Super-Resolution | MSU Video Upscalers: Quality Enhancement | SSIM | 0.917 | VDSR |
| Video Super-Resolution | MSU Video Upscalers: Quality Enhancement | VMAF | 36.46 | VDSR |