Exploiting Adapters for Cross-lingual Low-resource Speech Recognition

Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki

2021-05-18Speech Recognition Meta-Learning speech-recognition Cross-Lingual ASR General Knowledge

Abstract

Cross-lingual speech adaptation aims to solve the problem of leveraging multiple rich-resource languages to build models for a low-resource target language. Since the low-resource language has limited training data, speech recognition models can easily overfit. In this paper, we propose to use adapters to investigate the performance of multiple adapters for parameter-efficient cross-lingual speech adaptation. Based on our previous MetaAdapter that implicitly leverages adapters, we propose a novel algorithms called SimAdapter for explicitly learning knowledge from adapters. Our algorithm leverages adapters which can be easily integrated into the Transformer structure.MetaAdapter leverages meta-learning to transfer the general knowledge from training data to the test language. SimAdapter aims to learn the similarities between the source and target languages during fine-tuning using the adapters. We conduct extensive experiments on five-low-resource languages in Common Voice dataset. Results demonstrate that our MetaAdapter and SimAdapter methods can reduce WER by 2.98% and 2.55% with only 2.5% and 15.5% of trainable parameters compared to the strong full-model fine-tuning baseline. Moreover, we also show that these two novel algorithms can be integrated for better performance with up to 3.55% relative WER reduction.

Related Papers

Task-Specific Audio Coding for Machines: Machine-Learned Latent Features Are Codes for That Machine2025-07-17 NonverbalTTS: A Public English Corpus of Text-Aligned Nonverbal Vocalizations with Emotion Annotations for Text-to-Speech2025-07-17 Are encoders able to learn landmarkers for warm-starting of Hyperparameter Optimization?2025-07-16 Imbalanced Regression Pipeline Recommendation2025-07-16 CLID-MU: Cross-Layer Information Divergence Based Meta Update Strategy for Learning with Noisy Labels2025-07-16 PROL : Rehearsal Free Continual Learning in Streaming Data via Prompt Online Learning2025-07-16 Mixture of Experts in Large Language Models2025-07-15 WhisperKit: On-device Real-time ASR with Billion-Scale Transformers2025-07-14