Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition

Chenyu Liu, Jia Pan, Jinshui Hu, BaoCai Yin, Bing Yin, Mingjun Chen, Cong Liu, Jun Du, Qingfeng Liu

2024-07-16 | Handwritten Mathematical Expression Recognition | Document Understanding

Abstract

Recently, Handwritten Mathematical Expression Recognition (HMER) has gained considerable attention in pattern recognition for its diverse applications in document understanding. Current methods typically approach HMER as an image-to-sequence generation task within an autoregressive (AR) encoder-decoder framework. However, these approaches suffer from several drawbacks: 1) a lack of overall language context, limiting information utilization beyond the current decoding step; 2) error accumulation during AR decoding; and 3) slow decoding speed. To tackle these problems, this paper makes the first attempt to build a novel bottom-up Non-AutoRegressive Modeling approach for HMER, called NAMER. NAMER comprises a Visual Aware Tokenizer (VAT) and a Parallel Graph Decoder (PGD). First, the VAT tokenizes visible symbols and local relations at a coarse level. Then, the PGD refines all tokens and establishes connectivities in parallel, leveraging comprehensive visual and linguistic contexts. Experiments on the CROHME 2014/2016/2019 and HME100K datasets demonstrate that NAMER not only outperforms the current state-of-the-art (SOTA) methods on ExpRate by 1.93%/2.35%/1.49%/0.62%, but also achieves speedups of 13.7x in decoding time and 6.7x in overall FPS, proving the effectiveness and efficiency of NAMER.

Results

Task                                             Dataset      Metric   Value  Model
Handwritten Mathematical Expression Recognition  CROHME 2016  ExpRate  60.24  NAMER
Handwritten Mathematical Expression Recognition  HME100K      ExpRate  68.52  NAMER
Handwritten Mathematical Expression Recognition  CROHME 2019  ExpRate  61.72  NAMER
Handwritten Mathematical Expression Recognition  CROHME 2014  ExpRate  60.51  NAMER
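
ExpRate, the metric reported in the results above, is the expression recognition rate: the percentage of test expressions whose predicted output matches the ground truth exactly. The following is a minimal sketch of that computation, assuming predictions and references are already tokenized into LaTeX symbol sequences. The function name `exp_rate` and the toy data are illustrative only; the tokenization and normalization used in the paper's evaluation are not described on this page and may differ.

```python
from typing import Sequence


def exp_rate(
    predictions: Sequence[Sequence[str]],
    references: Sequence[Sequence[str]],
) -> float:
    """Return the percentage of expressions predicted exactly right.

    Assumption: an expression counts as correct only if its full token
    sequence equals the reference token sequence.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute ExpRate on an empty set")
    correct = sum(
        1 for pred, ref in zip(predictions, references) if list(pred) == list(ref)
    )
    return 100.0 * correct / len(references)


# Hypothetical toy data: four expressions as LaTeX token sequences.
preds = [
    ["x", "^", "2"],
    ["\\frac", "{", "a", "}", "{", "b", "}"],
    ["a", "+", "b"],
    ["y", "_", "1"],
]
refs = [
    ["x", "^", "2"],
    ["\\frac", "{", "a", "}", "{", "b", "}"],
    ["a", "-", "b"],  # one wrong symbol makes the whole expression wrong
    ["y", "_", "i"],
]

score = exp_rate(preds, refs)
print(f"ExpRate: {score:.2f}")
```

Because a single wrong symbol fails the whole expression, ExpRate is a strict metric, which is part of why gains of one to two points on it are treated as meaningful.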

Related Papers

A Survey on MLLM-based Visually Rich Document Understanding: Methods, Challenges, and Emerging Trends (2025-07-14)
PaddleOCR 3.0 Technical Report (2025-07-08)
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning (2025-07-01)
Class-Agnostic Region-of-Interest Matching in Document Images (2025-06-26)
DrishtiKon: Multi-Granular Visual Grounding for Text-Rich Document Images (2025-06-26)
Seeing is Believing? Mitigating OCR Hallucinations in Multimodal Large Language Models (2025-06-25)
PP-DocBee2: Improved Baselines with Efficient Data for Multimodal Document Understanding (2025-06-22)
WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts (2025-06-18)