Jie-En Yao, Li-Yuan Tsao, Yi-Chen Lo, Roy Tseng, Chia-Che Chang, Chun-Yi Lee
Flow-based methods have demonstrated promising results in addressing the ill-posed nature of super-resolution (SR) by learning the distribution of high-resolution (HR) images with normalizing flows. However, these methods can only perform SR at a predefined, fixed scale, limiting their potential in real-world applications. Meanwhile, arbitrary-scale SR has gained more attention and achieved great progress. Nonetheless, previous arbitrary-scale SR methods ignore the ill-posed problem and train the model with a per-pixel L1 loss, leading to blurry SR outputs. In this work, we propose "Local Implicit Normalizing Flow" (LINF) as a unified solution to the above problems. LINF models the distribution of texture details under different scaling factors with a normalizing flow. Thus, LINF can generate photo-realistic HR images with rich texture details at arbitrary scale factors. We evaluate LINF with extensive experiments and show that LINF achieves state-of-the-art perceptual quality compared with prior arbitrary-scale SR methods.
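To make the idea of modeling a conditional distribution with a normalizing flow more concrete, here is a minimal NumPy sketch of a single conditional affine flow layer. It is an illustration only, not the LINF architecture: the function names, the weight matrices `w_scale` and `w_shift`, and the conditioning vector `cond` (a stand-in for low-resolution features and the scale factor) are all hypothetical. It also assumes that the `t` in the "LINF t=0.0" result rows denotes a sampling temperature that scales the latent noise, which the abstract itself does not state.

```python
import numpy as np

def affine_forward(x, cond, w_scale, w_shift):
    """Map data x to latent z given cond; also return log|det dz/dx|."""
    log_s = np.tanh(cond @ w_scale)      # bounded log-scale (hypothetical choice)
    shift = cond @ w_shift
    z = (x - shift) * np.exp(-log_s)
    log_det = -log_s.sum(axis=-1)        # Jacobian is diagonal
    return z, log_det

def affine_inverse(z, cond, w_scale, w_shift):
    """Map latent z back to data space given the same conditioning."""
    log_s = np.tanh(cond @ w_scale)
    shift = cond @ w_shift
    return z * np.exp(log_s) + shift

def log_likelihood(x, cond, w_scale, w_shift):
    """Exact log p(x | cond) via change of variables, standard normal prior."""
    z, log_det = affine_forward(x, cond, w_scale, w_shift)
    d = x.shape[-1]
    log_pz = -0.5 * (z ** 2).sum(axis=-1) - 0.5 * d * np.log(2.0 * np.pi)
    return log_pz + log_det

def sample(cond, w_scale, w_shift, dim, temperature, rng):
    """Draw z ~ N(0, temperature^2 I) and invert the flow."""
    z = temperature * rng.standard_normal((cond.shape[0], dim))
    return affine_inverse(z, cond, w_scale, w_shift)
```

Training such a model would maximize `log_likelihood` instead of minimizing a per-pixel L1 loss. In this toy layer, a temperature of 0 collapses every sample to the predicted shift, giving a single deterministic output, while larger temperatures produce more varied outputs.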
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Super-Resolution | DIV2K val - 4x upscaling | LPIPS | 0.112 | LINF |
| Super-Resolution | DIV2K val - 4x upscaling | PSNR | 27.33 | LINF |
| Super-Resolution | DIV2K val - 4x upscaling | SSIM | 0.76 | LINF |
| Super-Resolution | DIV2K val - 4x upscaling | LPIPS | 0.248 | LINF t=0.0 |
| Super-Resolution | DIV2K val - 4x upscaling | PSNR | 29.14 | LINF t=0.0 |
| Super-Resolution | DIV2K val - 4x upscaling | SSIM | 0.83 | LINF t=0.0 |
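
When reading the table, note that higher is better for PSNR and SSIM, while lower is better for LPIPS. PSNR is a log-scale function of the mean squared error, so it rewards outputs close to the per-pixel average. The sketch below shows the standard PSNR formula, assuming images are float arrays with values in [0, 1]. The function name and defaults are illustrative, and the exact evaluation protocol behind the reported numbers (color space, border cropping) is not specified here.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")              # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```

For example, a constant error of 0.1 on a [0, 1] image gives an MSE of 0.01 and therefore a PSNR of about 20 dB.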