Zhendong Wang, Xiaodong Cun, Jianmin Bao, Wengang Zhou, Jianzhuang Liu, Houqiang Li
In this paper, we present Uformer, an effective and efficient Transformer-based architecture for image restoration, in which we build a hierarchical encoder-decoder network using the Transformer block. Uformer has two core designs. First, we introduce a novel locally-enhanced window (LeWin) Transformer block, which performs non-overlapping window-based self-attention instead of global self-attention. This significantly reduces the computational complexity on high-resolution feature maps while capturing local context. Second, we propose a learnable multi-scale restoration modulator in the form of a multi-scale spatial bias to adjust features in multiple layers of the Uformer decoder. Our modulator demonstrates superior capability for restoring details in various image restoration tasks while introducing only marginal extra parameters and computational cost. Powered by these two designs, Uformer enjoys a high capability for capturing both local and global dependencies for image restoration. To evaluate our approach, we conduct extensive experiments on several image restoration tasks, including image denoising, motion deblurring, defocus deblurring and deraining. Without bells and whistles, Uformer achieves superior or comparable performance compared with state-of-the-art algorithms. The code and models are available at https://github.com/ZhendongWang6/Uformer.
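
To make the complexity claim for non-overlapping window attention concrete, the following is a minimal sketch, not the authors' implementation. It counts only query-key pairs and ignores the channel dimension, the feed-forward layers, and everything else in the LeWin block. The function names and the example sizes are hypothetical, chosen purely for illustration.

```python
def window_partition(height, width, window):
    """Split an H x W grid of pixel coordinates into non-overlapping
    window x window tiles. Illustrative helper, not from the Uformer code."""
    if height % window or width % window:
        raise ValueError("height and width must be divisible by window")
    tiles = []
    for top in range(0, height, window):
        for left in range(0, width, window):
            tiles.append([(r, c)
                          for r in range(top, top + window)
                          for c in range(left, left + window)])
    return tiles


def attention_pairs_global(height, width):
    """Query-key pairs when every pixel attends to every pixel: (H*W)^2."""
    n = height * width
    return n * n


def attention_pairs_windowed(height, width, window):
    """Query-key pairs when each pixel attends only within its own tile."""
    tiles = window_partition(height, width, window)
    return sum(len(tile) ** 2 for tile in tiles)


if __name__ == "__main__":
    # Hypothetical sizes: an 8 x 8 feature map with 4 x 4 windows.
    h, w, m = 8, 8, 4
    print(attention_pairs_global(h, w))       # (8*8)^2 = 4096
    print(attention_pairs_windowed(h, w, m))  # 4 tiles * 16^2 = 1024
```

Under this simplified count, windowed attention needs fewer pairs than global attention by a factor of H·W / M² for an H × W map with M × M windows. The saving therefore grows with resolution, which is why the restriction matters most on high-resolution feature maps.
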
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Deblurring | GoPro | PSNR | 32.97 | Uformer-B |
| Deblurring | GoPro | SSIM | 0.967 | Uformer-B |
| Deblurring | RealBlur-R (trained on GoPro) | PSNR (sRGB) | 36.22 | Uformer-B |
| Deblurring | RealBlur-R (trained on GoPro) | SSIM (sRGB) | 0.957 | Uformer-B |
| Deblurring | RealBlur-J (trained on GoPro) | PSNR (sRGB) | 29.06 | Uformer-B |
| Deblurring | RealBlur-J (trained on GoPro) | SSIM (sRGB) | 0.884 | Uformer-B |
| Deblurring | HIDE (trained on GoPro) | PSNR (sRGB) | 30.83 | Uformer-B |
| Deblurring | HIDE (trained on GoPro) | Params (M) | 50.88 | Uformer-B |
| Deblurring | HIDE (trained on GoPro) | SSIM (sRGB) | 0.952 | Uformer-B |
| Deblurring | RSBlur | Average PSNR | 33.98 | Uformer-B |
| Image Enhancement | TIP 2018 | PSNR | 29.28 | Uformer-B |
| Image Enhancement | TIP 2018 | SSIM | 0.917 | Uformer-B |
| Dehazing | SOTS Indoor | PSNR | 31.91 | Uformer |
| Dehazing | SOTS Indoor | SSIM | 0.971 | Uformer |
| Dehazing | SOTS Outdoor | PSNR | 26.52 | Uformer |
| Dehazing | SOTS Outdoor | SSIM | 0.945 | Uformer |
| Image Restoration | CSD | Average PSNR (dB) | 33.8 | Uformer |
| Image Dehazing | SOTS Indoor | PSNR | 31.91 | Uformer |
| Image Dehazing | SOTS Indoor | SSIM | 0.971 | Uformer |
| Image Dehazing | SOTS Outdoor | PSNR | 26.52 | Uformer |
| Image Dehazing | SOTS Outdoor | SSIM | 0.945 | Uformer |
| Denoising | SIDD | PSNR (sRGB) | 39.89 | Uformer-B |
| Denoising | SIDD | SSIM (sRGB) | 0.96 | Uformer-B |
| Denoising | DND | PSNR (sRGB) | 39.98 | Uformer-B |
| Denoising | DND | SSIM (sRGB) | 0.955 | Uformer-B |
| Image Denoising | SIDD | PSNR (sRGB) | 39.89 | Uformer-B |
| Image Denoising | SIDD | SSIM (sRGB) | 0.96 | Uformer-B |
| Image Denoising | DND | PSNR (sRGB) | 39.98 | Uformer-B |
| Image Denoising | DND | SSIM (sRGB) | 0.955 | Uformer-B |
| Image Deblurring | GoPro | PSNR | 32.97 | Uformer-B |
| Image Deblurring | GoPro | Params (M) | 50.88 | Uformer-B |
| Image Deblurring | GoPro | SSIM | 0.967 | Uformer-B |
| Blind Image Deblurring | GoPro | PSNR | 32.97 | Uformer-B |
| Blind Image Deblurring | GoPro | SSIM | 0.967 | Uformer-B |
| Blind Image Deblurring | RealBlur-R (trained on GoPro) | PSNR (sRGB) | 36.22 | Uformer-B |
| Blind Image Deblurring | RealBlur-R (trained on GoPro) | SSIM (sRGB) | 0.957 | Uformer-B |
| Blind Image Deblurring | RealBlur-J (trained on GoPro) | PSNR (sRGB) | 29.06 | Uformer-B |
| Blind Image Deblurring | RealBlur-J (trained on GoPro) | SSIM (sRGB) | 0.884 | Uformer-B |
| Blind Image Deblurring | HIDE (trained on GoPro) | PSNR (sRGB) | 30.83 | Uformer-B |
| Blind Image Deblurring | HIDE (trained on GoPro) | Params (M) | 50.88 | Uformer-B |
| Blind Image Deblurring | HIDE (trained on GoPro) | SSIM (sRGB) | 0.952 | Uformer-B |
| Blind Image Deblurring | RSBlur | Average PSNR | 33.98 | Uformer-B |