Tianyu Ding, Luming Liang, Zhihui Zhu, Ilya Zharkov
DNN-based frame interpolation, which generates intermediate frames from two consecutive frames, typically relies on heavy model architectures with a huge number of features, preventing deployment on systems with limited resources, e.g., mobile devices. We propose a compression-driven network design for frame interpolation (CDFI) that leverages model pruning through sparsity-inducing optimization to significantly reduce the model size while achieving superior performance. Concretely, we first compress the recently proposed AdaCoF model and show that a 10X compressed AdaCoF performs similarly to its original counterpart; we then further improve this compressed model by introducing a multi-resolution warping module, which boosts visual consistency with multi-level details. As a consequence, we achieve a significant performance gain with only a quarter of the size of the original AdaCoF. Moreover, our model performs favorably against other state-of-the-art methods on a broad range of datasets. Finally, the proposed compression-driven framework is generic and can be easily transferred to other DNN-based frame interpolation algorithms. Our source code is available at https://github.com/tding1/CDFI.
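
To illustrate what "sparsity-inducing optimization" can look like, here is a minimal sketch on a toy linear problem. The abstract does not name the solver or the regularizer, so this example assumes an L1 penalty and uses plain proximal gradient descent (ISTA) as a stand-in. The function names `soft_threshold` and `ista`, the toy data, and the value of `lam` are all hypothetical and are not taken from the CDFI code.

```python
import numpy as np


def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1: shrink toward zero, clip at zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def ista(A, b, lam, steps=500):
    """Minimize 0.5 * ||A w - b||^2 + lam * ||w||_1 by proximal gradient.

    Assumption: a toy least-squares loss stands in for a network's training loss.
    """
    # Lipschitz constant of the gradient of the smooth term (squared spectral norm).
    lipschitz = np.linalg.norm(A, 2) ** 2
    w = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = A.T @ (A @ w - b)
        # Gradient step on the smooth part, then the L1 proximal step.
        w = soft_threshold(w - grad / lipschitz, lam / lipschitz)
    return w


# Toy data: only the first 3 of 20 weights matter.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]
b = A @ w_true

w = ista(A, b, lam=5.0)
density = np.count_nonzero(w) / w.size  # fraction of weights that survive
print(f"nonzero weights: {np.count_nonzero(w)} of {w.size} (density {density:.2f})")
```

The soft-thresholding step sets small weights exactly to zero, so the fraction of surviving weights gives a rough indication of how far a layer could be shrunk. In a compression-driven design, a density measured this way could guide the choice of a smaller architecture that is then trained from scratch.
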
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video | Vimeo90K | LPIPS | 0.01 | CDFI |
| Video | Vimeo90K | PSNR | 35.17 | CDFI |
| Video | Middlebury | LPIPS | 0.007 | CDFI |
| Video | Middlebury | PSNR | 37.14 | CDFI |
| Video | Middlebury | SSIM | 0.966 | CDFI |
| Video | UCF101 | LPIPS | 0.015 | CDFI |
| Video | UCF101 | PSNR | 35.21 | CDFI |
| Video | MSU Video Frame Interpolation | LPIPS | 0.051 | CDFI |
| Video | MSU Video Frame Interpolation | MS-SSIM | 0.926 | CDFI |
| Video | MSU Video Frame Interpolation | PSNR | 26.99 | CDFI |
| Video | MSU Video Frame Interpolation | SSIM | 0.908 | CDFI |
| Video | MSU Video Frame Interpolation | VMAF | 61.72 | CDFI |
| Video Frame Interpolation | Vimeo90K | LPIPS | 0.01 | CDFI |
| Video Frame Interpolation | Vimeo90K | PSNR | 35.17 | CDFI |
| Video Frame Interpolation | Middlebury | LPIPS | 0.007 | CDFI |
| Video Frame Interpolation | Middlebury | PSNR | 37.14 | CDFI |
| Video Frame Interpolation | Middlebury | SSIM | 0.966 | CDFI |
| Video Frame Interpolation | UCF101 | LPIPS | 0.015 | CDFI |
| Video Frame Interpolation | UCF101 | PSNR | 35.21 | CDFI |
| Video Frame Interpolation | MSU Video Frame Interpolation | LPIPS | 0.051 | CDFI |
| Video Frame Interpolation | MSU Video Frame Interpolation | MS-SSIM | 0.926 | CDFI |
| Video Frame Interpolation | MSU Video Frame Interpolation | PSNR | 26.99 | CDFI |
| Video Frame Interpolation | MSU Video Frame Interpolation | SSIM | 0.908 | CDFI |
| Video Frame Interpolation | MSU Video Frame Interpolation | VMAF | 61.72 | CDFI |
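
For readers unfamiliar with the PSNR column, the sketch below computes the standard peak signal-to-noise ratio in dB between a ground-truth frame and an interpolated frame. It assumes 8-bit images with a peak value of 255. The exact evaluation protocol behind the table (color space, cropping, averaging across frames) is not stated here, so this function is not guaranteed to reproduce the reported numbers.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


# Toy example: a frame that is off by one intensity level everywhere.
frame = np.full((4, 4, 3), 100, dtype=np.uint8)
off_by_one = frame + 1
print(f"{psnr(frame, off_by_one):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a gain of about 3 dB corresponds to roughly halving the error.
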