Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

HandReader: Advanced Techniques for Efficient Fingerspelling Recognition

Pavel Korotaev, Petr Surovtsev, Alexander Kapitanov, Karina Kvanchiani, Aleksandr Nagaev

2025-05-15 | Sign Language Recognition

Abstract

Fingerspelling is a significant component of Sign Language (SL) that allows the interpretation of proper names and is characterized by fast hand movements during signing. Although previous works on fingerspelling recognition have focused on processing the temporal dimension of videos, there remains room for improving the accuracy of these approaches. This paper introduces HandReader, a group of three architectures designed to address the fingerspelling recognition task. HandReader$_{RGB}$ employs the novel Temporal Shift-Adaptive Module (TSAM) to process RGB features from videos of varying lengths while preserving important sequential information. HandReader$_{KP}$ is built on the proposed Temporal Pose Encoder (TPE), which operates on keypoints represented as tensors. Composing keypoints in a batch this way allows the encoder to pass them through 2D and 3D convolution layers, utilizing temporal and spatial information and accumulating keypoint coordinates. We also introduce HandReader$_{RGB+KP}$, an architecture with a joint encoder that benefits from both the RGB and keypoint modalities. Each HandReader model possesses distinct advantages and achieves state-of-the-art results on the ChicagoFSWild and ChicagoFSWild+ datasets. Moreover, the models demonstrate high performance on Znaki, the first open dataset for Russian fingerspelling, which is presented in this paper. The Znaki dataset and HandReader pre-trained models are publicly available.

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Sign Language Recognition | ChicagoFSWild+ | CER (%) | 24.4 | HandReader_RGB+KP |
| Sign Language Recognition | ChicagoFSWild+ | CER (%) | 26.2 | HandReader_KP |
| Sign Language Recognition | ChicagoFSWild+ | CER (%) | 27.6 | HandReader_RGB |
| Sign Language Recognition | ChicagoFSWild | CER (%) | 27.1 | HandReader_RGB+KP |
| Sign Language Recognition | ChicagoFSWild | CER (%) | 28 | HandReader_KP |
| Sign Language Recognition | ChicagoFSWild | CER (%) | 30.7 | HandReader_RGB |
| Sign Language Recognition | Znaki | CER (%) | 5.06 | HandReader_RGB+KP |
| Sign Language Recognition | Znaki | CER (%) | 7.35 | HandReader_KP |
| Sign Language Recognition | Znaki | CER (%) | 7.61 | HandReader_RGB |
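
The results above are reported as character error rate (CER, lower is better). As a rough illustration of how such a number is typically obtained, here is a minimal sketch that divides the Levenshtein edit distance between predicted and reference letter sequences by the total reference length. The function names `edit_distance` and `cer_percent` are hypothetical, and the corpus-level aggregation is an assumption: the page does not say whether the paper sums edits over the whole test set or averages per-sample rates, so this is not necessarily the authors' exact evaluation protocol.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    # prev[j] is the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty string costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def cer_percent(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level CER in percent (assumed aggregation, see the note above)."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return 100.0 * total_edits / total_chars


if __name__ == "__main__":
    # One substitution over five reference characters.
    print(cer_percent(["hello"], ["hallo"]))
```

Under this definition, a CER of 24.4% corresponds to roughly one character edit for every four reference characters.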

Related Papers

Hierarchical Sub-action Tree for Continuous Sign Language Recognition (2025-06-26)
SignBart -- New approach with the skeleton sequence for Isolated Sign language Recognition (2025-06-18)
SLRNet: A Real-Time LSTM-Based Sign Language Recognition System (2025-06-11)
Fine-Tuning Video Transformers for Word-Level Bangla Sign Language: A Comparative Analysis for Classification Tasks (2025-06-04)
Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language (2025-05-20)
Enhancing Mathematics Learning for Hard-of-Hearing Students Through Real-Time Palestinian Sign Language Recognition: A New Dataset (2025-05-16)
Logos as a Well-Tempered Pre-train for Sign Language Recognition (2025-05-15)
TSLFormer: A Lightweight Transformer Model for Turkish Sign Language Recognition Using Skeletal Landmarks (2025-05-11)