Speech Recognition on AISHELL-1
Metric: Word Error Rate (WER) (lower is better)
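For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed: the token-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the reference length. The function name `word_error_rate` and its arguments are illustrative and not taken from any entry on this leaderboard. Mandarin transcripts are commonly scored per character, but this page does not state how each entry was tokenized, nor whether the values are percentages; the sketch returns a fraction.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / len(reference).

    Both arguments are sequences of tokens: words for English-style
    scoring, or single characters if scoring per character.
    """
    ref = list(reference)
    hyp = list(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the hypothesis prefix hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference token missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis token)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical usage: one substituted word out of three reference words.
rate = word_error_rate("the cat sat".split(), "the cat sit".split())
```

Passing `list(text)` instead of `text.split()` scores per character rather than per word. Because insertions count as errors, the rate can exceed 1.0 when the hypothesis is much longer than the reference.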
Results
| # | Model | Word Error Rate (WER) | Extra Data | SOTA | Date | Paper | Code |
|---|---|---|---|---|---|---|---|
| 1 | FireRedASR-AED | 0.55 | Yes | SOTA | 2025-01-24 | FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration | Yes |
| 2 | Seed-ASR | 0.68 | Yes | SOTA | 2024-07-05 | Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition | - |
| 3 | Qwen-Audio | 1.29 | Yes | SOTA | 2023-11-14 | Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models | Yes |
| 4 | MMSpeech With LM | 1.9 | No | SOTA | 2022-11-29 | MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition | Yes |
| 5 | Paraformer-large | 1.95 | Yes | - | 2023-05-18 | FunASR: A Fundamental End-to-End Speech Recognition Toolkit | Yes |
| 6 | Zipformer+CR-CTC (no external language model) | 4.02 | No | - | 2024-10-07 | CR-CTC: Consistency regularization on CTC for improved speech recognition | Yes |
| 7 | Lightweight Transducer With LM | 4.03 | No | - | 2024-09-05 | Lightweight Transducer Based on Frame-Level Criterion | Yes |
| 8 | SE-WSBO With LM | 4.1 | No | SOTA | 2022-07-24 | Improving Mandarin Speech Recogntion with Block-augmented Transformer | Yes |
| 9 | CIF-HKD With LM | 4.1 | No | - | 2023-01-30 | Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation | Yes |
| 10 | Lightweight Transducer | 4.31 | No | - | 2024-09-05 | Lightweight Transducer Based on Frame-Level Criterion | Yes |
| 11 | UMA | 4.7 | No | - | 2023-09-15 | Unimodal Aggregation for CTC-based Speech Recognition | Yes |
| 12 | U2 | 4.72 | No | SOTA | 2020-12-10 | Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition | Yes |
| 13 | Paraformer | 4.95 | No | - | 2023-05-18 | FunASR: A Fundamental End-to-End Speech Recognition Toolkit | Yes |
| 14 | BAT | 4.97 | No | - | 2023-05-19 | BAT: Boundary aware transducer for memory-efficient and low-latency ASR | Yes |
| 15 | CTC-CRF 4gram-LM | 6.34 | No | SOTA | 2020-05-27 | CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency | Yes |
| 16 | BRA-E | 6.63 | No | - | 2023-03-23 | Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition | - |
| 17 | CTC/Att | 6.7 | No | SOTA | 2019-09-13 | A Comparative Study on Transformer vs RNN in Speech Applications | Yes |
| 18 | Att | 18.7 | No | SOTA | 2018-08-30 | End-to-end Speech Recognition with Adaptive Computation Steps | - |