Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


SignBERT+: Hand-model-aware Self-supervised Pre-training for Sign Language Understanding

Hezhen Hu, Weichao Zhao, Wengang Zhou, Houqiang Li

2023-05-08 · Sign Language Translation · Self-Supervised Learning · Sign Language Recognition

Paper · PDF

Abstract

Hand gestures play a crucial role in the expression of sign language. Current deep-learning-based methods for sign language understanding (SLU) are prone to over-fitting due to insufficient sign data resources and suffer from limited interpretability. In this paper, we propose the first self-supervised pre-trainable SignBERT+ framework with a model-aware hand prior incorporated. In our framework, the hand pose is regarded as a visual token, derived from an off-the-shelf detector. Each visual token is embedded with gesture state and spatio-temporal position encoding. To take full advantage of current sign data resources, we first perform self-supervised learning to model their statistics. To this end, we design multi-level masked modeling strategies (joint, frame, and clip) to mimic common failure cases of the detector. Jointly with these masked modeling strategies, we incorporate the model-aware hand prior to better capture hierarchical context over the sequence. After pre-training, we carefully design simple yet effective prediction heads for downstream tasks. To validate the effectiveness of our framework, we perform extensive experiments on three main SLU tasks: isolated and continuous sign language recognition (SLR), and sign language translation (SLT). Experimental results demonstrate the effectiveness of our method, achieving new state-of-the-art performance with a notable gain.
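The multi-level masked modeling described above can be sketched as follows. This is a minimal illustration of the idea, not the paper's implementation: the tensor shapes, masking ratios, and function names are assumptions chosen for clarity (21 joints per hand, 2D coordinates, a short 16-frame sequence).

```python
import numpy as np

# Illustrative sketch (not the authors' code) of joint-, frame-, and
# clip-level masking applied to a sequence of hand-pose "visual tokens".
# Shapes and ratios below are assumptions, not values from the paper.

rng = np.random.default_rng(0)

T, J, D = 16, 21, 2            # frames, hand joints, (x, y) coordinates
poses = rng.random((T, J, D))  # stand-in for off-the-shelf detector output
mask = np.ones((T, J), dtype=bool)  # True = token visible to the model

def mask_joints(mask, ratio=0.15):
    """Joint-level: hide random individual joints (mimics missed keypoints)."""
    out = mask.copy()
    out[rng.random(mask.shape) < ratio] = False
    return out

def mask_frames(mask, n_frames=2):
    """Frame-level: hide all joints in a few frames (mimics failed detections)."""
    out = mask.copy()
    frames = rng.choice(mask.shape[0], size=n_frames, replace=False)
    out[frames, :] = False
    return out

def mask_clip(mask, clip_len=4):
    """Clip-level: hide a contiguous span of frames (mimics occlusion)."""
    out = mask.copy()
    start = rng.integers(0, mask.shape[0] - clip_len + 1)
    out[start:start + clip_len, :] = False
    return out

# Compose the three levels, then zero out the hidden tokens; the
# pre-training objective would reconstruct the hidden poses.
masked = mask_clip(mask_frames(mask_joints(mask)))
masked_poses = np.where(masked[..., None], poses, 0.0)
```

In the actual framework, the reconstruction of the hidden tokens is regularized by the model-aware hand prior; the sketch only shows the corruption side of the objective.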

Results

Task                      | Dataset                     | Metric                | Value | Model
Sign Language Translation | RWTH-PHOENIX-Weather 2014 T | BLEU-4                | 25.7  | SignBERT+
Sign Language Recognition | WLASL                       | Top-1 Accuracy        | 55.59 | SignBERT+
Sign Language Recognition | RWTH-PHOENIX-Weather 2014   | Word Error Rate (WER) | 20    | SignBERT+
Sign Language Recognition | RWTH-PHOENIX-Weather 2014 T | Word Error Rate (WER) | 19.9  | SignBERT+
Sign Language Recognition | MSASL-1000                  | P-C Top-1 Accuracy    | 70.77 | SignBERT+
Sign Language Recognition | MSASL-1000                  | P-I Top-1 Accuracy    | 73.71 | SignBERT+
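Word Error Rate, the continuous-SLR metric in the table above, is the word-level edit distance (substitutions, insertions, deletions) divided by the reference length; lower is better. A minimal sketch of the standard definition (not the paper's evaluation script):

```python
def wer(ref, hyp):
    """Word Error Rate: minimum edit distance between the reference and
    hypothesis word sequences, normalized by the reference length."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # delete all remaining reference words
    for j in range(len(h) + 1):
        d[0][j] = j  # insert all hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[-1][-1] / len(r)

# e.g. wer("the cat sat", "the cat sit") -> 1/3 (one substitution)
```

A WER of 19.9 on PHOENIX-2014T thus means roughly one word error per five reference glosses.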

Related Papers

A Semi-Supervised Learning Method for the Identification of Bad Exposures in Large Imaging Surveys (2025-07-17)
Self-supervised Learning on Camera Trap Footage Yields a Strong Universal Face Embedder (2025-07-14)
Speak2Sign3D: A Multi-modal Pipeline for English Speech to American Sign Language Animation (2025-07-09)
Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis (2025-07-08)
World4Drive: End-to-End Autonomous Driving via Intention-aware Physical Latent World Model (2025-07-01)
ShapeEmbed: a self-supervised learning framework for 2D contour quantification (2025-07-01)
RetFiner: A Vision-Language Refinement Scheme for Retinal Foundation Models (2025-06-27)
Boosting Generative Adversarial Transferability with Self-supervised Vision Transformer Features (2025-06-26)