Hanzhou Liu, Chengkai Liu, Jiacong Xu, Peng Jiang, Mi Lu
Deep state-space models (SSMs), such as the recent Mamba architectures, are emerging as a promising alternative to CNN and Transformer networks. Existing Mamba-based restoration methods process visual data with a flatten-and-scan strategy that converts image patches into a 1D sequence before scanning. However, this scanning paradigm ignores local pixel dependencies and introduces spatial misalignment by placing spatially distant pixels next to each other in the sequence, which reduces local noise-awareness and degrades image sharpness in low-level vision tasks. To overcome these issues, we propose a novel slice-and-scan strategy that alternates between intra-slice and inter-slice scanning. We further design a new Vision State Space Module (VSSM) for image deblurring that addresses the inefficiency of current Mamba-based vision modules. Building upon this, we develop XYScanNet, an SSM architecture integrated with a lightweight feature fusion module for enhanced image deblurring. XYScanNet maintains competitive distortion metrics and significantly improves perceptual performance. Experimental results show that XYScanNet improves KID by $17\%$ compared to the nearest competitor.
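The spatial misalignment described in the abstract can be illustrated with a small NumPy sketch. It is not the XYScanNet implementation. The function names (`flatten_scan_order`, `axis_scan_sequences`, `max_neighbor_jump`) are hypothetical, and treating each image row as a "slice" is only one possible reading of slice-and-scan, assumed here for illustration.

```python
import numpy as np

def flatten_scan_order(h, w):
    """(row, col) coordinates of an h x w grid in row-major 1D scan order."""
    rows, cols = np.divmod(np.arange(h * w), w)
    return np.stack([rows, cols], axis=1)

def axis_scan_sequences(h, w):
    """Assumed slice-based alternative: each row is one slice.

    Intra-slice sequences run along x inside a row; inter-slice
    sequences run along y across rows. No sequence wraps around.
    """
    idx = np.arange(h * w).reshape(h, w)
    intra = [idx[r, :] for r in range(h)]
    inter = [idx[:, c] for c in range(w)]
    return intra, inter

def to_coords(flat_idx, w):
    """Convert flat row-major indices back to (row, col) coordinates."""
    rows, cols = np.divmod(np.asarray(flat_idx), w)
    return np.stack([rows, cols], axis=1)

def max_neighbor_jump(coords):
    """Largest Chebyshev distance between consecutive scan positions."""
    steps = np.abs(np.diff(coords, axis=0))
    return int(steps.max())

h, w = 4, 6
flat_jump = max_neighbor_jump(flatten_scan_order(h, w))
intra, inter = axis_scan_sequences(h, w)
axis_jump = max(max_neighbor_jump(to_coords(s, w)) for s in intra + inter)
print(flat_jump, axis_jump)
```

In the flattened scan, the last pixel of one row is immediately followed by the first pixel of the next row, so two sequence neighbors can be `w - 1` columns apart. In the per-axis sequences of this sketch, consecutive elements are always adjacent pixels.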
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Deblurring | GoPro | PSNR | 33.91 | XYScanNet |
| Deblurring | GoPro | SSIM | 0.968 | XYScanNet |
| Deblurring | HIDE (trained on GoPro) | PSNR (sRGB) | 31.74 | XYScanNet |
| Deblurring | HIDE (trained on GoPro) | SSIM (sRGB) | 0.947 | XYScanNet |
| Blind Image Deblurring | GoPro | PSNR | 33.91 | XYScanNet |
| Blind Image Deblurring | GoPro | SSIM | 0.968 | XYScanNet |
| Blind Image Deblurring | HIDE (trained on GoPro) | PSNR (sRGB) | 31.74 | XYScanNet |
| Blind Image Deblurring | HIDE (trained on GoPro) | SSIM (sRGB) | 0.947 | XYScanNet |