Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


On Efficient Transformer-Based Image Pre-training for Low-Level Vision

Wenbo Li, Xin Lu, Shengju Qian, Jiangbo Lu, Xiangyu Zhang, Jiaya Jia

Published: 2021-12-19 · Tasks: Denoising, Super-Resolution, Image Super-Resolution
Links: Paper · PDF · Code (official)

Abstract

Pre-training underpins numerous state-of-the-art results in high-level computer vision, yet few attempts have been made to investigate how pre-training acts in image processing systems. In this paper, we tailor transformer-based pre-training regimes that boost various low-level tasks. To comprehensively diagnose the influence of pre-training, we design a whole set of principled evaluation tools that uncover its effects on internal representations. The observations demonstrate that pre-training plays strikingly different roles in low-level tasks. For example, pre-training introduces more local information to higher layers in super-resolution (SR), yielding significant performance gains, while pre-training hardly affects internal feature representations in denoising, resulting in limited gains. Further, we explore different methods of pre-training, revealing that multi-related-task pre-training is more effective and data-efficient than other alternatives. Finally, we extend our study to varying data scales and model sizes, as well as comparisons between transformer-based and CNN-based architectures. Based on the study, we successfully develop state-of-the-art models for multiple low-level tasks. Code is released at https://github.com/fenglinglwb/EDT.
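The multi-related-task pre-training that the abstract reports as most effective can be pictured as a shared backbone trained on interleaved batches from several low-level tasks (e.g. SR and denoising). The sketch below shows only a hypothetical round-robin batch schedule, not the EDT codebase's actual training loop; the helper name `multi_task_batches` and the round-robin policy are assumptions for illustration.

```python
import itertools

def multi_task_batches(task_loaders, steps):
    """Interleave batches from several related low-level tasks.

    task_loaders: dict mapping task name -> iterator yielding batches.
    Yields (task_name, batch) pairs in round-robin order, so a shared
    backbone sees every task in turn -- a simplified stand-in for
    multi-related-task pre-training schedules.
    """
    cycle = itertools.cycle(task_loaders.items())
    for _ in range(steps):
        name, loader = next(cycle)
        yield name, next(loader)

# Example: two toy "loaders" whose batches are just integers.
loaders = {"sr": iter([0, 1, 2]), "denoise": iter([10, 11, 12])}
schedule = list(multi_task_batches(loaders, 4))
# schedule == [("sr", 0), ("denoise", 10), ("sr", 1), ("denoise", 11)]
```

In a real setup each task would keep its own data pipeline and loss, while the transformer backbone parameters are updated by every batch regardless of task.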

Results

Task                  | Dataset             | Metric | Value  | Model
----------------------|---------------------|--------|--------|------
Super-Resolution      | Set5 - 3x upscaling | PSNR   | 35.13  | EDT-B
Super-Resolution      | Set5 - 3x upscaling | SSIM   | 0.9328 | EDT-B
Super-Resolution      | Set5 - 2x upscaling | PSNR   | 38.63  | EDT-B
Super-Resolution      | Set5 - 2x upscaling | SSIM   | 0.9632 | EDT-B
Image Super-Resolution| Set5 - 3x upscaling | PSNR   | 35.13  | EDT-B
Image Super-Resolution| Set5 - 3x upscaling | SSIM   | 0.9328 | EDT-B
Image Super-Resolution| Set5 - 2x upscaling | PSNR   | 38.63  | EDT-B
Image Super-Resolution| Set5 - 2x upscaling | SSIM   | 0.9632 | EDT-B
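The PSNR values above follow the standard decibel definition, 10·log10(MAX² / MSE). The function below is a minimal sketch of that formula, not the paper's evaluation code (which, like most SR benchmarks, typically crops borders and measures on the luminance channel); the function name `psnr` is ours.

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two images of equal shape."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Example: images differing by exactly 1 gray level everywhere (MSE = 1)
a = np.zeros((8, 8))
b = np.ones((8, 8))
print(round(psnr(a, b), 2))  # -> 48.13
```

SSIM, the table's other metric, is more involved (local windowed means, variances, and covariances); library implementations such as scikit-image's `structural_similarity` are the usual choice.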

Related Papers

fastWDM3D: Fast and Accurate 3D Healthy Tissue Inpainting (2025-07-17)
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models (2025-07-17)
SpectraLift: Physics-Guided Spectral-Inversion Network for Self-Supervised Hyperspectral Image Super-Resolution (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
HUG-VAS: A Hierarchical NURBS-Based Generative Model for Aortic Geometry Synthesis and Controllable Editing (2025-07-15)
AirLLM: Diffusion Policy-based Adaptive LoRA for Remote Fine-Tuning of LLM over the Air (2025-07-15)
IM-LUT: Interpolation Mixing Look-Up Tables for Image Super-Resolution (2025-07-14)
PanoDiff-SR: Synthesizing Dental Panoramic Radiographs using Diffusion and Super-resolution (2025-07-12)