Speech Recognition on swb_hub_500 WER fullSWBCH
Metric: Percentage error (lower is better)
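For readers unfamiliar with the metric, the percentage error reported here is a word error rate (WER). The sketch below shows the standard definition, 100 × (substitutions + deletions + insertions) / reference words, computed with a word-level edit distance. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; published evaluations typically also apply text normalization before scoring, which this sketch omits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: 100 * (S + D + I) / N.

    Illustrative only: tokenizes on whitespace and applies no text
    normalization (both are assumptions, not the benchmark's scoring rules).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 100 when the hypothesis is much longer than the reference.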
Results
| # | Model | Percentage error | Extra data | Paper | Date | Code | SOTA |
|---|---|---|---|---|---|---|---|
| 1 | IBM (LSTM+Conformer encoder-decoder) | 6.8 | No | On the limit of English conversational speech recognition | 2021-05-03 | - | SOTA |
| 2 | IBM (LSTM encoder-decoder) | 7.8 | No | Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard | 2020-01-20 | - | SOTA |
| 3 | ResNet + BiLSTMs acoustic model | 10.3 | No | English Conversational Telephone Speech Recognition by Humans and Machines | 2017-03-06 | - | SOTA |
| 4 | VGG/ResNet/LACE/BiLSTM acoustic model trained on SWB+Fisher+CH, N-gram + RNNLM language model trained on Switchboard+Fisher+Gigaword+Broadcast | 11.9 | No | The Microsoft 2016 Conversational Speech Recognition System | 2016-09-12 | - | SOTA |
| 5 | RNN + VGG + LSTM acoustic model trained on SWB+Fisher+CH, N-gram + "model M" + NNLM language model | 12.2 | No | The IBM 2016 English Conversational Telephone Speech Recognition System | 2016-04-27 | - | SOTA |
| 6 | HMM-BLSTM trained with MMI + data augmentation (speed) + iVectors + 3 regularizations + Fisher | 13 | No | No paper | - | - | - |
| 7 | HMM-TDNN trained with MMI + data augmentation (speed) + iVectors + 3 regularizations + Fisher (10% / 15.1% respectively trained on SWBD only) | 13.3 | No | No paper | - | - | - |
| 8 | CNN + Bi-RNN + CTC (speech to letters), 25.9% WER if trained only on SWB | 16 | No | Deep Speech: Scaling up end-to-end speech recognition | 2014-12-17 | Yes | SOTA |
| 9 | HMM-TDNN + iVectors | 17.1 | No | No paper | - | - | - |
| 10 | HMM-DNN + sMBR | 18.4 | No | No paper | - | - | - |
| 11 | DNN + Dropout | 19.1 | No | Building DNN Acoustic Models for Large Vocabulary Speech Recognition | 2014-06-30 | Yes | SOTA |
| 12 | HMM-TDNN + pNorm + speed up/down speech | 19.3 | No | No paper | - | - | - |