Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


No Padding Please: Efficient Neural Handwriting Recognition

Gideon Maillette de Buy Wenniger, Lambert Schomaker, Andy Way

2019-02-28 · Handwriting Recognition · Handwritten Text Recognition
Paper · PDF · Code

Abstract

Neural handwriting recognition (NHR) is the recognition of handwritten text with deep learning models, such as multi-dimensional long short-term memory (MDLSTM) recurrent neural networks. Models with MDLSTM layers have achieved state-of-the-art results on handwritten text recognition tasks. While multi-directional MDLSTM layers have an unmatched ability to capture the complete context in all directions, this strength limits the possibilities for parallelization and therefore comes at a high computational cost. In this work we develop methods to create efficient MDLSTM-based models for NHR, in particular a method aimed at eliminating the computation wasted on padding. The proposed method, called example-packing, replaces the wasteful stacking of padded examples with efficient tiling in a 2-dimensional grid. For word-based NHR this yields a speedup by a factor of 6.6 over an already efficient baseline that pads minimally within each batch. For line-based NHR the savings are more modest but still significant. In addition to example-packing, we propose: 1) a technique to optimize parallelization for dynamic graph definition frameworks such as PyTorch, using convolutions with grouping; 2) a method for parallelization across GPUs for batches of variable-length examples. All our techniques are thoroughly tested on our own PyTorch re-implementation of MDLSTM-based NHR models. A thorough evaluation on the IAM dataset shows that our models perform similarly to earlier implementations of state-of-the-art models. Our efficient NHR model, and some of the reusable techniques behind it, offer ways to build relatively efficient models for the ubiquitous deep learning scenario of variable-length inputs.
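The core idea of example-packing — tiling variable-width examples side by side in a 2-D grid instead of right-padding every one to the widest example in the batch — can be illustrated with a minimal sketch. This is our own hypothetical, greedy row-packing illustration (the function name `pack_examples`, the fixed-height simplification, and the `grid_width` parameter are all our assumptions, not the paper's actual implementation):

```python
import numpy as np

def pack_examples(images, grid_width):
    """Greedily tile variable-width images (all the same height, for
    simplicity) into rows of a 2-D grid, rather than padding each image
    to the widest one in the batch. Hypothetical sketch of the
    example-packing idea; not the authors' code."""
    rows, current, used = [], [], 0
    positions = []  # (row_index, x_offset) per image, to map outputs back
    for img in images:
        _, w = img.shape
        if used + w > grid_width and current:
            rows.append(current)      # start a new grid row when full
            current, used = [], 0
        positions.append((len(rows), used))
        current.append(img)
        used += w
    if current:
        rows.append(current)
    # Build the packed grid: each row holds its images side by side,
    # zero-filled only in the leftover space up to grid_width.
    h = images[0].shape[0]
    grid = np.zeros((len(rows) * h, grid_width), dtype=images[0].dtype)
    for r, row in enumerate(rows):
        x = 0
        for img in row:
            grid[r * h:(r + 1) * h, x:x + img.shape[1]] = img
            x += img.shape[1]
    return grid, positions
```

With widths [3, 5, 2, 4] and `grid_width=8`, the four images pack into two grid rows (3+5 and 2+4), so the grid has 4 × 8 = 32 cells, versus 4 × max_width = 4 × 2 × 5 = 40 cells for naive padding — the padding-dependent saving the abstract quantifies for word-based NHR.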

Results

Task                                | Dataset | Metric | Value | Model
Optical Character Recognition (OCR) | IAM     | CER    | 6.6   | Leaky LP Cell
Optical Character Recognition (OCR) | IAM     | WER    | 15.9  | Leaky LP Cell
Handwritten Text Recognition        | IAM     | CER    | 6.6   | Leaky LP Cell
Handwritten Text Recognition        | IAM     | WER    | 15.9  | Leaky LP Cell

Related Papers

- Advancing Offline Handwritten Text Recognition: A Systematic Review of Data Augmentation and Generation Techniques (2025-07-08)
- A Transformer Based Handwriting Recognition System Jointly Using Online and Offline Features (2025-06-25)
- Learning to Align: Addressing Character Frequency Distribution Shifts in Handwritten Text Recognition (2025-06-11)
- Creating a Historical Migration Dataset from Finnish Church Records, 1800-1920 (2025-06-09)
- MetaWriter: Personalized Handwritten Text Recognition Using Meta-Learned Prompt Tuning (2025-05-26)
- Preserving Privacy Without Compromising Accuracy: Machine Unlearning for Handwritten Text Recognition (2025-04-11)
- Meta-DAN: towards an efficient prediction strategy for page-level handwritten text recognition (2025-04-04)
- TRIDIS: A Comprehensive Medieval and Early Modern Corpus for HTR and NER (2025-03-25)