Mikhail Papkov, Pavel Chizhov, Leopold Parts
Self-supervised image denoising aims to restore the signal from a noisy image without access to the ground truth. State-of-the-art solutions for this task rely on predicting masked pixels with a fully convolutional neural network. These approaches typically require multiple forward passes, information about the noise model, or intricate regularization functions. In this paper, we propose a Swin Transformer-based Image Autoencoder (SwinIA), the first fully transformer-based architecture for self-supervised denoising. The flexibility of the attention mechanism helps to fulfill the blind-spot property, which convolutional counterparts normally only approximate. SwinIA can be trained end-to-end with a simple mean squared error loss, without masking, and does not require any prior knowledge about the clean data or the noise distribution. Simple to use, SwinIA establishes the state of the art on several common benchmarks.
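
The blind-spot property mentioned in the abstract can be illustrated with a toy example. The sketch below is not the SwinIA architecture; it is a minimal single-head attention over scalar pixels, written under two assumptions made purely for illustration: queries are computed from positions only, and each pixel is barred from attending to itself. The names `blind_spot_attention`, `mse`, `w_q`, and `w_k` are hypothetical. Under these assumptions the output at index `i` cannot depend on the noisy value at `i`, so a plain mean squared error against the noisy input can serve as a training signal without masking.

```python
import math

def blind_spot_attention(pixels, positions, w_q=1.0, w_k=1.0):
    """Toy attention in which output i never sees pixels[i].

    pixels    -- list of noisy scalar intensities (at least two)
    positions -- list of scalar position codes, same length
    w_q, w_k  -- hypothetical scalar projection weights
    """
    n = len(pixels)
    out = []
    for i in range(n):
        # Assumption: the query depends on the position only, not the pixel.
        q = w_q * positions[i]
        # Keys and values come from every pixel except pixel i itself.
        others = [j for j in range(n) if j != i]
        scores = [q * w_k * pixels[j] for j in others]
        # Numerically stable softmax over the remaining pixels.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append(sum(w * pixels[j] for w, j in zip(weights, others)) / total)
    return out

def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

noisy = [0.2, 0.5, 0.9, 0.4]
coords = [0.0, 0.25, 0.5, 0.75]
restored = blind_spot_attention(noisy, coords)
loss = mse(restored, noisy)  # the noisy image itself is the target
```

Changing `noisy[2]` alters the outputs at the other indices but leaves `restored[2]` untouched, which is the property a blind-spot network needs in order to avoid learning the identity mapping.
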
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Denoising | Kodak24 sigma5-50 | PSNR | 30.3 | SwinIA |
| Denoising | Kodak24 sigma5-50 | SSIM | 0.82 | SwinIA |
| Denoising | BSD300 sigma5-50 | PSNR | 28.4 | SwinIA |
| Denoising | BSD300 sigma5-50 | SSIM | 0.785 | SwinIA |
| Denoising | BSD300 sigma25 | PSNR | 28.4 | SwinIA |
| Denoising | BSD300 sigma25 | SSIM | 0.789 | SwinIA |
| Denoising | Set14 lambda30 | PSNR | 28.74 | SwinIA |
| Denoising | Set14 lambda30 | SSIM | 0.799 | SwinIA |
| Denoising | Kodak24 lambda5-50 | PSNR | 29.06 | SwinIA |
| Denoising | Kodak24 lambda5-50 | SSIM | 0.788 | SwinIA |
| Denoising | Set14 sigma5-50 | PSNR | 29.49 | SwinIA |
| Denoising | Set14 sigma5-50 | SSIM | 0.809 | SwinIA |
| Denoising | Set14 lambda5-50 | PSNR | 28.27 | SwinIA |
| Denoising | Set14 lambda5-50 | SSIM | 0.78 | SwinIA |
| Denoising | Kodak24 sigma25 | PSNR | 30.12 | SwinIA |
| Denoising | Kodak24 sigma25 | SSIM | 0.819 | SwinIA |
| Denoising | Kodak24 lambda30 | PSNR | 29.51 | SwinIA |
| Denoising | Kodak24 lambda30 | SSIM | 0.805 | SwinIA |
| Denoising | BSD300 lambda5-50 | PSNR | 27.74 | SwinIA |
| Denoising | BSD300 lambda5-50 | SSIM | 0.764 | SwinIA |
| Denoising | Set14 sigma25 | PSNR | 29.54 | SwinIA |
| Denoising | Set14 sigma25 | SSIM | 0.814 | SwinIA |
| Denoising | BSD300 lambda30 | PSNR | 27.92 | SwinIA |
| Denoising | BSD300 lambda30 | SSIM | 0.775 | SwinIA |
| Denoising | Set12 sigma50 | PSNR | 26.03 | SwinIA |
| Denoising | Set12 sigma50 | SSIM | 0.736 | SwinIA |
| Denoising | Set12 sigma15 | PSNR | 30.37 | SwinIA |
| Denoising | Set12 sigma15 | SSIM | 0.857 | SwinIA |
| Denoising | BSD68 sigma15 | PSNR | 31.07 | SwinIA |
| Denoising | BSD68 sigma15 | SSIM | 0.856 | SwinIA |
| Denoising | Hanzi | PSNR | 14.35 | SwinIA |
| Denoising | Hanzi | SSIM | 0.556 | SwinIA |
| Denoising | BSD68 sigma25 | PSNR | 29.17 | SwinIA |
| Denoising | BSD68 sigma25 | SSIM | 0.801 | SwinIA |
| Denoising | Set12 sigma25 | PSNR | 28.72 | SwinIA |
| Denoising | Set12 sigma25 | SSIM | 0.817 | SwinIA |
| Denoising | BSD68 sigma50 | PSNR | 26.61 | SwinIA |
| Denoising | BSD68 sigma50 | SSIM | 0.706 | SwinIA |
| Medical Image Denoising | FMD Confocal Fish | PSNR | 31.79 | SwinIA |
| Medical Image Denoising | FMD Confocal Fish | SSIM | 0.871 | SwinIA |
| Medical Image Denoising | FMD Confocal Mice | PSNR | 37.65 | SwinIA |
| Medical Image Denoising | FMD Confocal Mice | SSIM | 0.96 | SwinIA |
| Medical Image Denoising | FMD Two-Photon Mice | PSNR | 33.25 | SwinIA |
| Medical Image Denoising | FMD Two-Photon Mice | SSIM | 0.915 | SwinIA |
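
The PSNR values in the table follow the standard peak signal-to-noise ratio definition, $\mathrm{PSNR} = 10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. A minimal sketch of that computation is given below, assuming flat sequences of pixel intensities and an 8-bit peak value of 255; the function name `psnr` and the `max_value` parameter are illustrative, and the exact evaluation protocol behind the reported numbers is not specified here.

```python
import math

def psnr(clean, denoised, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    max_value is the largest possible intensity (255 assumed for 8-bit data).
    """
    if len(clean) != len(denoised) or len(clean) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((c - d) ** 2 for c, d in zip(clean, denoised)) / len(clean)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Every pixel is off by 25.5 on a 0-255 scale: MSE = 650.25, ratio = 100.
value = psnr([0.0, 0.0], [25.5, 25.5])  # 20 dB
```

Higher is better: each additional 10 dB corresponds to a tenfold reduction in mean squared error.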