Zipformer+pruned transducer (no external language model)

Reported on 3 benchmarks across 1 task · 2 papers · 2 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Audio · 3 results

Speech Recognition on GigaSpeech DEV
Word Error Rate (WER) · 2024-10-07
10.09
best: 9.12 (SAMBA ASR)
SOTA
CR-CTC: Consistency regularization on CTC for improved speech recognition arXiv:2410.05101
Speech Recognition on GigaSpeech TEST
Word Error Rate (WER) · 2024-10-07
10.2
best: 10.03 (Zipformer+pruned transducer w/ CR-CTC (no external language model))
SOTA
CR-CTC: Consistency regularization on CTC for improved speech recognition arXiv:2410.05101
Speech Recognition on LibriSpeech test-other
Word Error Rate (WER) · 2023-10-17
4.38
best: 2.48 (SAMBA ASR)
Zipformer: A faster and better encoder for automatic speech recognition arXiv:2310.11230