Li-Yuan Tsao, Yi-Chen Lo, Chia-Che Chang, Hao-Wei Chen, Roy Tseng, Chien Feng, Chun-Yi Lee
Flow-based super-resolution (SR) models have demonstrated astonishing capabilities in generating high-quality images. However, these methods encounter several challenges during image generation, such as grid artifacts, exploding inverses, and suboptimal results due to a fixed sampling temperature. To overcome these issues, this work introduces a conditional learned prior into the inference phase of a flow-based SR model. This prior is a latent code that our proposed latent module predicts from the low-resolution image; the flow model then transforms this code into an SR image. Our framework is designed to integrate seamlessly with any contemporary flow-based SR model without modifying its architecture or pre-trained weights. We evaluate the effectiveness of our proposed framework through extensive experiments and ablation analyses. The proposed framework successfully addresses the above issues in flow-based SR models and enhances their performance in various SR scenarios. Our code is available at: https://github.com/liyuantsao/BFSR
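The inference change described in the abstract can be illustrated with a minimal sketch. This is a toy, not the authors' implementation: `toy_flow_inverse`, `toy_latent_module`, and `sample_fixed_temperature` are hypothetical stand-ins on 1-D lists, and the 2x upscaling factor and the 0.1 detail scale are arbitrary assumptions. The only point it shows is that the frozen flow is left untouched while the source of the latent code changes from a fixed-temperature Gaussian sample to a prediction conditioned on the low-resolution input.

```python
import random
from typing import Callable, List

Vector = List[float]


def upsample_2x(lr: Vector) -> Vector:
    """Nearest-neighbour 2x upsampling of a 1-D signal."""
    return [v for v in lr for _ in range(2)]


def toy_flow_inverse(z: Vector, lr: Vector) -> Vector:
    """Hypothetical stand-in for a frozen, pre-trained flow (latent -> SR).

    The latent code adds detail on top of the upsampled LR signal.
    """
    base = upsample_2x(lr)
    return [b + 0.1 * zi for b, zi in zip(base, z)]


def sample_fixed_temperature(lr: Vector, tau: float, rng: random.Random) -> Vector:
    """Standard flow inference: z ~ N(0, tau^2 I) with a hand-picked tau."""
    return [tau * rng.gauss(0.0, 1.0) for _ in range(2 * len(lr))]


def toy_latent_module(lr: Vector) -> Vector:
    """Hypothetical stand-in for a learned latent module.

    It predicts the latent code from the LR input; here a forward
    difference of the upsampled signal replaces a trained network.
    """
    base = upsample_2x(lr)
    last = len(base) - 1
    return [base[min(i + 1, last)] - base[i] for i in range(len(base))]


def super_resolve(lr: Vector, latent_fn: Callable[[Vector], Vector]) -> Vector:
    """Run the unchanged flow on whichever latent code latent_fn supplies."""
    return toy_flow_inverse(latent_fn(lr), lr)


if __name__ == "__main__":
    lr = [0.0, 1.0]
    rng = random.Random(0)
    # Fixed-temperature sampling: output depends on tau and the random draw.
    print(super_resolve(lr, lambda x: sample_fixed_temperature(x, 0.8, rng)))
    # Conditional learned prior: output is a deterministic function of lr.
    print(super_resolve(lr, toy_latent_module))
```

Because `super_resolve` only swaps the `latent_fn` argument, the sketch mirrors the claim that the framework needs no change to the flow model's architecture or weights.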
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Super-Resolution | DIV2K val - 4x upscaling | LPIPS | 0.105 | LINF-LP |
| Super-Resolution | DIV2K val - 4x upscaling | LRPSNR | 47.3 | LINF-LP |
| Super-Resolution | DIV2K val - 4x upscaling | PSNR | 28 | LINF-LP |
| Super-Resolution | DIV2K val - 4x upscaling | SSIM | 0.78 | LINF-LP |
| Super-Resolution | DIV2K val - 4x upscaling | LPIPS | 0.109 | SRFlow-LP |
| Super-Resolution | DIV2K val - 4x upscaling | LRPSNR | 51.51 | SRFlow-LP |
| Super-Resolution | DIV2K val - 4x upscaling | PSNR | 27.51 | SRFlow-LP |
| Super-Resolution | DIV2K val - 4x upscaling | SSIM | 0.78 | SRFlow-LP |