Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Speech Recognition on LibriSpeech test-other

Metric: Word Error Rate (WER), lower is better
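For readers unfamiliar with the metric, WER is conventionally defined as (substitutions + deletions + insertions) divided by the number of reference words, computed from a word-level edit-distance alignment. The sketch below is a minimal illustration of that definition only. The function name `word_error_rate` is hypothetical, splitting on whitespace is an assumption, and the text normalization used by the individual leaderboard entries is not stated on this page. The leaderboard values are presumably percentages, so the fraction is multiplied by 100 in the usage line.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction: (subs + dels + ins) / number of reference words.

    Assumption: words are separated by whitespace and compared exactly;
    real evaluations usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer * 100:.2f}%")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.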


Results

| # | Model | Word Error Rate (WER) | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | SAMBA ASR | 2.48 | No | Samba-ASR: State-Of-The-Art Speech Recognition L... | 2025-01-06 | - |
| 2 | FAdam | 2.49 | No | FAdam: Adam is a natural gradient optimizer usin... | 2024-05-21 | Code |
| 3 | w2v-BERT XXL | 2.5 | No | W2v-BERT: Combining Contrastive Learning and Mas... | 2021-08-07 | Code |
| 4 | Conformer + Wav2vec 2.0 + SpecAugment-based Noisy Student Training with Libri-Light | 2.6 | No | Pushing the Limits of Semi-Supervised Learning f... | 2020-10-20 | Code |
| 5 | HuBERT with Libri-Light | 2.9 | No | HuBERT: Self-Supervised Speech Representation Le... | 2021-06-14 | Code |
| 6 | wav2vec 2.0 with Libri-Light | 3 | No | wav2vec 2.0: A Framework for Self-Supervised Lea... | 2020-06-20 | Code |
| 7 | Conv + Transformer + wav2vec2.0 + pseudo labeling | 3.1 | No | Self-training and Pre-training are Complementary... | 2020-10-22 | Code |
| 8 | WavLM Large | 3.2 | No | WavLM: Large-Scale Self-Supervised Pre-Training ... | 2021-10-26 | Code |
| 9 | SpeechStew (1B) | 3.3 | No | SpeechStew: Simply Mix All Available Speech Reco... | 2021-04-05 | - |
| 10 | ContextNet + SpecAugment-based Noisy Student Training with Libri-Light | 3.4 | No | Improved Noisy Student Training for Automatic Sp... | 2020-05-19 | Code |
| 11 | E-Branchformer (L) + Internal Language Model Estimation | 3.65 | No | E-Branchformer: Branchformer with Enhanced mergi... | 2022-09-30 | Code |
| 12 | data2vec | 3.7 | No | data2vec: A General Framework for Self-supervise... | 2022-02-07 | Code |
| 13 | Conv + Transformer AM + Iterative Pseudo-Labeling (n-gram LM + Transformer Rescoring) | 3.83 | No | Iterative Pseudo-Labeling for Speech Recognition | 2020-05-19 | Code |
| 14 | Conformer(L) | 3.9 | Yes | Conformer: Convolution-augmented Transformer for... | 2020-05-16 | Code |
| 15 | Zipformer+pruned transducer w/ CR-CTC (no external language model) | 3.95 | No | CR-CTC: Consistency regularization on CTC for im... | 2024-10-07 | Code |
| 16 | SpeechStew (100M) | 4 | No | SpeechStew: Simply Mix All Available Speech Reco... | 2021-04-05 | - |
| 17 | wav2vec 2.0 | 4.1 | Yes | wav2vec 2.0: A Framework for Self-Supervised Lea... | 2020-06-20 | Code |
| 18 | ContextNet(L) | 4.1 | No | ContextNet: Improving Convolutional Neural Netwo... | 2020-05-07 | Code |
| 19 | Conv + Transformer AM (ConvLM with Transformer Rescoring) | 4.11 | Yes | End-to-end ASR: from Supervised to Semi-Supervis... | 2019-11-19 | Code |
| 20 | CTC + Transformer LM rescoring | 4.2 | Yes | Faster, Simpler and More Accurate Hybrid ASR Sys... | 2020-05-19 | - |
| 21 | Transformer Transducer | 4.2 | Yes | Improving RNN Transducer Based ASR with Auxiliar... | 2020-11-05 | Code |
| 22 | Qwen-Audio | 4.2 | No | Qwen-Audio: Advancing Universal Audio Understand... | 2023-11-14 | Code |
| 23 | Conformer(M) | 4.3 | Yes | Conformer: Convolution-augmented Transformer for... | 2020-05-16 | Code |
| 24 | Zipformer+CR-CTC (no external language model) | 4.35 | No | CR-CTC: Consistency regularization on CTC for im... | 2024-10-07 | Code |
| 25 | Zipformer+pruned transducer (no external language model) | 4.38 | No | Zipformer: A faster and better encoder for autom... | 2023-10-17 | Code |
| 26 | Multistream CNN with Self-Attentive SRU | 4.46 | No | ASAPP-ASR: Multistream CNN and Self-Attentive SR... | 2020-05-21 | - |
| 27 | ContextNet(M) | 4.5 | Yes | ContextNet: Improving Convolutional Neural Netwo... | 2020-05-07 | Code |
| 28 | hybrid + Transformer LM rescoring | 4.85 | Yes | Transformer-based Acoustic Modeling for Hybrid S... | 2019-10-22 | - |
| 29 | Branchformer + GFSA | 4.94 | No | Graph Convolutions Enrich the Self-Attention in ... | 2023-12-07 | Code |
| 30 | Hybrid model with Transformer rescoring | 5 | No | RWTH ASR Systems for LibriSpeech: Hybrid vs Atte... | 2019-05-08 | Code |
| 31 | Conformer(S) | 5 | Yes | Conformer: Convolution-augmented Transformer for... | 2020-05-16 | Code |
| 32 | Conv + Transformer AM (ConvLM with Transformer Rescoring) (LS only) | 5.18 | No | End-to-end ASR: from Supervised to Semi-Supervis... | 2019-11-19 | Code |
| 33 | ContextNet(S) | 5.5 | Yes | ContextNet: Improving Convolutional Neural Netwo... | 2020-05-07 | Code |
| 34 | LSTM Transducer | 5.6 | Yes | Librispeech Transducer Model with Internal Langu... | 2021-04-07 | Code |
| 35 | Transformer | 5.7 | Yes | A Comparative Study on Transformer vs RNN in Spe... | 2019-09-13 | Code |
| 36 | LAS + SpecAugment | 5.8 | Yes | SpecAugment: A Simple Data Augmentation Method f... | 2019-04-18 | Code |
| 37 | Multi-Stream Self-Attention With Dilated 1D Convolutions | 5.8 | No | State-of-the-Art Speech Recognition Using Multi-... | 2019-10-01 | Code |
| 38 | Squeezeformer (L) | 5.97 | No | Squeezeformer: An Efficient Transformer for Auto... | 2022-06-02 | Code |
| 39 | LAS (no LM) | 6.5 | Yes | SpecAugment: A Simple Data Augmentation Method f... | 2019-04-18 | Code |
| 40 | Conformer with Relaxed Attention | 6.85 | No | Relaxed Attention: A Simple Method to Boost Perf... | 2021-07-02 | Code |
| 41 | QuartzNet15x5 | 7.25 | No | - | - | Code |
| 42 | tdnn + chain + rnnlm rescoring | 7.63 | Yes | - | - | - |
| 43 | Jasper DR 10x5 (+ Time/Freq Masks) | 7.84 | No | Jasper: An End-to-End Convolutional Neural Acous... | 2019-04-05 | Code |
| 44 | Espresso | 8.7 | No | Espresso: A Fast End-to-end Neural Speech Recogn... | 2019-09-18 | Code |
| 45 | Jasper DR 10x5 | 8.79 | No | Jasper: An End-to-End Convolutional Neural Acous... | 2019-04-05 | Code |
| 46 | MT4SSL | 9.6 | No | MT4SSL: Boosting Self-Supervised Speech Represen... | 2022-11-14 | Code |
| 47 | Convolutional Speech Recognition | 10.47 | Yes | Fully Convolutional Speech Recognition | 2018-12-17 | - |
| 48 | CTC-CRF 4gram-LM | 10.65 | No | - | - | Code |
| 49 | TDNN + pNorm + speed up/down speech | 12.5 | No | - | - | - |
| 50 | Deep Speech 2 | 13.25 | No | Deep Speech 2: End-to-End Speech Recognition in ... | 2015-12-08 | Code |
| 51 | Local Prior Matching (Large Model, ConvLM LM) | 15.28 | No | Semi-Supervised Speech Recognition via Local Pri... | 2020-02-24 | Code |
| 52 | Snips | 16.5 | No | Snips Voice Platform: an embedded Spoken Languag... | 2018-05-25 | Code |
| 53 | Local Prior Matching (Large Model) | 20.84 | Yes | Semi-Supervised Speech Recognition via Local Pri... | 2020-02-24 | Code |