Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Deep Unrestricted Document Image Rectification

Hao Feng, Shaokai Liu, Jiajun Deng, Wengang Zhou, Houqiang Li

2023-04-18 · Local Distortion

Abstract

In recent years, tremendous efforts have been made in document image rectification, but existing advanced algorithms are limited to processing restricted document images, i.e., the input images must contain a complete document. Once the captured image covers only a local text region, rectification quality degrades and becomes unsatisfactory. Our previously proposed DocTr, a transformer-assisted network for document image rectification, also suffers from this limitation. In this work, we present DocTr++, a novel unified framework for document image rectification that places no restrictions on the input distorted images. Our major technical improvements can be summarized in three aspects. First, we upgrade the original architecture by adopting a hierarchical encoder-decoder structure for multi-scale representation extraction and parsing. Second, we reformulate the pixel-wise mapping relationship between the unrestricted distorted document images and their distortion-free counterparts. The data obtained under this formulation is used to train DocTr++ for unrestricted document image rectification. Third, we contribute a real-world test set and metrics applicable for evaluating the rectification quality. To the best of our knowledge, this is the first learning-based method for the rectification of unrestricted document images. Extensive experiments demonstrate the effectiveness and superiority of our method. We hope DocTr++ will serve as a strong baseline for generic document image rectification, prompting further advancement and application of learning-based algorithms. The source code and the proposed dataset are publicly available at https://github.com/fh2019ustc/DocTr-Plus.
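The abstract refers to a pixel-wise mapping between a distorted image and its distortion-free counterpart. As a rough illustration of how such a mapping is typically applied, the sketch below resamples a distorted image through a backward map using bilinear interpolation. It is a generic sketch and not the DocTr++ implementation: the abstract does not specify the map's format, so the `(src_y, src_x)` layout and the names `bilinear_sample` and `rectify` are assumptions made for this example.

```python
from typing import List, Tuple

Image = List[List[float]]                      # grayscale image as rows of pixels
BackwardMap = List[List[Tuple[float, float]]]  # assumed layout: per output pixel, (src_y, src_x)


def bilinear_sample(img: Image, y: float, x: float) -> float:
    """Sample img at fractional (y, x), clamping coordinates to the image border."""
    h, w = len(img), len(img[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
    bottom = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def rectify(distorted: Image, backward_map: BackwardMap) -> Image:
    """Build the output image by looking up each output pixel's source location."""
    return [[bilinear_sample(distorted, sy, sx) for (sy, sx) in row]
            for row in backward_map]


# Toy example: the "distortion" is a one-pixel shift to the right,
# so each output pixel (y, x) reads from source location (y, x + 1).
distorted = [[0, 10, 20],
             [0, 30, 40]]
backward_map = [[(0, 1), (0, 2)],
                [(1, 1), (1, 2)]]
print(rectify(distorted, backward_map))  # [[10.0, 20.0], [30.0, 40.0]]
```

In a learning-based pipeline, a network would predict the backward map from the distorted image, and the resampling step would usually be delegated to a library routine rather than written by hand.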

Results

Task              Dataset   Metric   Value   Model
Local Distortion  DocUNet   LD       7.52    DocTr++
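For context on the LD value reported above: on the DocUNet benchmark, Local Distortion is commonly described as the mean magnitude of a dense displacement field (computed with SIFT flow) between the rectified image and the scanned ground truth, with lower values being better. The sketch below shows only that final averaging step, assuming the flow field is already available. The function name `local_distortion` and the `(dx, dy)` layout are hypothetical, and the metrics proposed in this paper may differ in detail.

```python
import math
from typing import List, Tuple

Flow = List[List[Tuple[float, float]]]  # assumed layout: per pixel, (dx, dy) displacement


def local_distortion(flow: Flow) -> float:
    """Mean L2 magnitude of a dense displacement field (flow estimation not included)."""
    magnitudes = [math.hypot(dx, dy) for row in flow for (dx, dy) in row]
    return sum(magnitudes) / len(magnitudes)


# Toy 2x2 flow field: one pixel is displaced by (3, 4), the rest are aligned.
toy_flow = [[(3.0, 4.0), (0.0, 0.0)],
            [(0.0, 0.0), (0.0, 0.0)]]
print(local_distortion(toy_flow))  # 1.25
```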
