Lipreading on LRS2
Metric: Word Error Rate (WER) (lower is better)
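For readers unfamiliar with the metric, WER is conventionally defined as the word-level edit distance between a reference transcript and a predicted transcript, divided by the number of reference words: (substitutions + deletions + insertions) / N. The sketch below is a minimal illustration of that standard definition, not the scoring script used by any of the listed papers. The function name `word_error_rate`, whitespace tokenization, and reporting the value as a percentage are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: 100 * (S + D + I) / N.

    Assumptions for this sketch: words are split on whitespace and
    compared exactly (no case folding or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr

    return 100.0 * prev[-1] / len(ref)
```

Because insertions count as errors, WER can exceed 100 when the hypothesis is much longer than the reference.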
Results
| # | Model | Word Error Rate (WER) | Extra Data | Paper | Date | Code |
|---|-------|-----------------------|------------|-------|------|------|
| 1 | Auto-AVSR | 14.6 | Yes | Auto-AVSR: Audio-Visual Speech Recognition with ... | 2023-03-25 | Yes |
| 2 | USR | 15.4 | Yes | Unified Speech Recognition: A Single Model for A... | 2024-11-04 | Yes |
| 3 | SyncVSR | 16.5 | Yes | SyncVSR: Data-Efficient Visual Speech Recognitio... | 2024-06-18 | Yes |
| 4 | RAVEn Large | 18.6 | Yes | Jointly Learning Visual and Auditory Speech Repr... | 2022-12-12 | Yes |
| 5 | VTP (more data) | 22.6 | Yes | Sub-word Level Lip Reading With Visual Attention | 2021-10-14 | - |
| 6 | ES³ Large + extLM | 24.6 | Yes | - | - | - |
| 7 | CTC/Attention (LRW+LRS2/3+AVSpeech) | 25.5 | Yes | Visual Speech Recognition for Multiple Languages... | 2022-02-26 | Yes |
| 8 | ES³ Large | 26.7 | Yes | - | - | - |
| 9 | ES³ Base + extLM | 28.7 | Yes | - | - | - |
| 10 | VTP | 28.9 | Yes | Sub-word Level Lip Reading With Visual Attention | 2021-10-14 | - |
| 11 | SyncVSR | 28.9 | No | SyncVSR: Data-Efficient Visual Speech Recognitio... | 2024-06-18 | Yes |
| 12 | ES³ Base* + extLM | 29.3 | No | - | - | - |
| 13 | ES³ Base | 30.7 | Yes | - | - | - |
| 14 | ES³ Base* | 31.4 | No | - | - | - |
| 15 | CTC/Attention | 32.9 | No | Visual Speech Recognition for Multiple Languages... | 2022-02-26 | Yes |
| 16 | Hybrid CTC / Attention | 39.1 | No | End-to-end Audio-visual Speech Recognition with ... | 2021-02-12 | Yes |
| 17 | MoCo + wav2vec (w/o extLM) | 43.2 | No | Leveraging Unimodal Self-Supervised Learning for... | 2022-02-24 | Yes |
| 18 | Multi-head Visual-Audio Memory | 44.5 | Yes | Distinguishing Homophenes Using Multi-Head Visua... | 2022-04-04 | Yes |
| 19 | TM-seq2seq + extLM | 48.3 | Yes | Deep Audio-Visual Speech Recognition | 2018-09-06 | Yes |
| 20 | LF-MMI TDNN | 48.86 | Yes | Audio-visual Recognition of Overlapped speech fo... | 2020-01-06 | - |
| 21 | Hybrid CTC / Attention | 50 | No | Audio-Visual Speech Recognition With A Hybrid CT... | 2018-09-28 | - |
| 22 | Conv-seq2seq | 51.7 | Yes | - | - | - |
| 23 | CTC + KD ASR | 53.2 | Yes | ASR is all you need: cross-modal distillation fo... | 2019-11-28 | - |
| 24 | TM-CTC + extLM | 54.7 | Yes | Deep Audio-Visual Speech Recognition | 2018-09-06 | Yes |
| 25 | LIBS | 65.29 | No | Hearing Lips: Improving Lip Reading by Distillin... | 2019-11-26 | Yes |
| 26 | SyncVSR | 74.6 | No | SyncVSR: Data-Efficient Visual Speech Recognitio... | 2024-06-18 | Yes |