Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Lip2Wav

Reported on 75 benchmarks across 4 tasks · 1 paper · 75 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
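
The matching rule in the note above can be illustrated with a minimal sketch. It assumes a hypothetical list of (model name, paper label) records and groups them by the exact, case-sensitive name string. The helper `group_by_exact_name`, the record layout, and the `paper-A`/`paper-B`/`paper-C` labels are placeholders invented for illustration; they are not the site's actual implementation or real references.

```python
from collections import defaultdict

# Hypothetical records: (model name as reported, source paper label).
# The paper labels are placeholders, not real references.
RESULTS = [
    ("Lip2Wav", "paper-A"),
    ("Lip2Wav", "paper-B"),  # same name, possibly a different model variant
    ("lip2wav", "paper-C"),  # differs only by case, so it is not merged
]


def group_by_exact_name(results):
    """Group paper labels under the exact (case-sensitive) model name."""
    groups = defaultdict(list)
    for name, paper in results:
        groups[name].append(paper)
    return dict(groups)


groups = group_by_exact_name(RESULTS)

# Names reported by more than one paper may refer to different variants.
ambiguous = {name for name, papers in groups.items() if len(set(papers)) > 1}
```

Under these assumptions, `ambiguous` flags any name that more than one paper uses, which is the case the note warns about.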

Audio · 24 results

Task: Speech Recognition. All entries are dated 2020-05-17 and come from "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis" (arXiv:2005.08209).

Dataset · Metric · Value · Status · Best reported
LRW · ESTOI · 0.344 · SOTA
LRW · PESQ · 1.197 · SOTA
LRW · STOI · 0.543 · SOTA
Lip2Wav (EH) · ESTOI · 0.22 · SOTA · best: 0.304 (Visual Voice Memory)
Lip2Wav (EH) · PESQ · 1.367 · SOTA
Lip2Wav (EH) · STOI · 0.369 · SOTA · best: 0.463 (Visual Voice Memory)
Lip2Wav (Chess) · ESTOI · 0.29 · SOTA · best: 0.334 (Visual Voice Memory)
Lip2Wav (Chess) · PESQ · 1.4 · SOTA · best: 1.503 (Visual Voice Memory)
Lip2Wav (Chess) · STOI · 0.418 · SOTA · best: 0.506 (Visual Voice Memory)
Lip2Wav (DL) · ESTOI · 0.183 · SOTA · best: 0.402 (Visual Voice Memory)
Lip2Wav (DL) · PESQ · 1.671 · SOTA
Lip2Wav (DL) · STOI · 0.282 · SOTA · best: 0.576 (Visual Voice Memory)
Lip2Wav (HS) · ESTOI · 0.311 · SOTA · best: 0.337 (Visual Voice Memory)
Lip2Wav (HS) · PESQ · 1.29 · SOTA · best: 1.366 (Visual Voice Memory)
Lip2Wav (HS) · STOI · 0.446 · SOTA · best: 0.504 (Visual Voice Memory)
Lip2Wav (Chem) · ESTOI · 0.284 · SOTA · best: 0.429 (Visual Voice Memory)
Lip2Wav (Chem) · PESQ · 1.3 · SOTA · best: 1.529 (Visual Voice Memory)
Lip2Wav (Chem) · STOI · 0.416 · SOTA · best: 0.566 (Visual Voice Memory)
TCD-TIMIT corpus (mixed-speech) · ESTOI · 36.5 · SOTA
TCD-TIMIT corpus (mixed-speech) · PESQ · 1.35 · SOTA
TCD-TIMIT corpus (mixed-speech) · STOI · 0.558 · SOTA
GRID corpus (mixed-speech) · ESTOI · 0.535 · SOTA · best: 0.579 (Visual Voice Memory)
GRID corpus (mixed-speech) · PESQ · 1.772 · SOTA · best: 1.984 (Visual Voice Memory)
GRID corpus (mixed-speech) · STOI · 0.731 · SOTA · best: 0.738 (Visual Voice Memory)

Speech · 24 results

Task: Visual Speech Recognition. All entries are dated 2020-05-17 and come from "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis" (arXiv:2005.08209).

Dataset · Metric · Value · Status · Best reported
LRW · ESTOI · 0.344 · SOTA
LRW · PESQ · 1.197 · SOTA
LRW · STOI · 0.543 · SOTA
Lip2Wav (EH) · ESTOI · 0.22 · SOTA · best: 0.304 (Visual Voice Memory)
Lip2Wav (EH) · PESQ · 1.367 · SOTA
Lip2Wav (EH) · STOI · 0.369 · SOTA · best: 0.463 (Visual Voice Memory)
Lip2Wav (Chess) · ESTOI · 0.29 · SOTA · best: 0.334 (Visual Voice Memory)
Lip2Wav (Chess) · PESQ · 1.4 · SOTA · best: 1.503 (Visual Voice Memory)
Lip2Wav (Chess) · STOI · 0.418 · SOTA · best: 0.506 (Visual Voice Memory)
Lip2Wav (DL) · ESTOI · 0.183 · SOTA · best: 0.402 (Visual Voice Memory)
Lip2Wav (DL) · PESQ · 1.671 · SOTA
Lip2Wav (DL) · STOI · 0.282 · SOTA · best: 0.576 (Visual Voice Memory)
Lip2Wav (HS) · ESTOI · 0.311 · SOTA · best: 0.337 (Visual Voice Memory)
Lip2Wav (HS) · PESQ · 1.29 · SOTA · best: 1.366 (Visual Voice Memory)
Lip2Wav (HS) · STOI · 0.446 · SOTA · best: 0.504 (Visual Voice Memory)
Lip2Wav (Chem) · ESTOI · 0.284 · SOTA · best: 0.429 (Visual Voice Memory)
Lip2Wav (Chem) · PESQ · 1.3 · SOTA · best: 1.529 (Visual Voice Memory)
Lip2Wav (Chem) · STOI · 0.416 · SOTA · best: 0.566 (Visual Voice Memory)
TCD-TIMIT corpus (mixed-speech) · ESTOI · 36.5 · SOTA
TCD-TIMIT corpus (mixed-speech) · PESQ · 1.35 · SOTA
TCD-TIMIT corpus (mixed-speech) · STOI · 0.558 · SOTA
GRID corpus (mixed-speech) · ESTOI · 0.535 · SOTA · best: 0.579 (Visual Voice Memory)
GRID corpus (mixed-speech) · PESQ · 1.772 · SOTA · best: 1.984 (Visual Voice Memory)
GRID corpus (mixed-speech) · STOI · 0.731 · SOTA · best: 0.738 (Visual Voice Memory)

Computer Vision · 24 results

Task: Lip to Speech Synthesis. All entries are dated 2020-05-17 and come from "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis" (arXiv:2005.08209).

Dataset · Metric · Value · Status · Best reported
LRW · ESTOI · 0.344 · SOTA
LRW · PESQ · 1.197 · SOTA
LRW · STOI · 0.543 · SOTA
Lip2Wav (EH) · ESTOI · 0.22 · SOTA · best: 0.304 (Visual Voice Memory)
Lip2Wav (EH) · PESQ · 1.367 · SOTA
Lip2Wav (EH) · STOI · 0.369 · SOTA · best: 0.463 (Visual Voice Memory)
Lip2Wav (Chess) · ESTOI · 0.29 · SOTA · best: 0.334 (Visual Voice Memory)
Lip2Wav (Chess) · PESQ · 1.4 · SOTA · best: 1.503 (Visual Voice Memory)
Lip2Wav (Chess) · STOI · 0.418 · SOTA · best: 0.506 (Visual Voice Memory)
Lip2Wav (DL) · ESTOI · 0.183 · SOTA · best: 0.402 (Visual Voice Memory)
Lip2Wav (DL) · PESQ · 1.671 · SOTA
Lip2Wav (DL) · STOI · 0.282 · SOTA · best: 0.576 (Visual Voice Memory)
Lip2Wav (HS) · ESTOI · 0.311 · SOTA · best: 0.337 (Visual Voice Memory)
Lip2Wav (HS) · PESQ · 1.29 · SOTA · best: 1.366 (Visual Voice Memory)
Lip2Wav (HS) · STOI · 0.446 · SOTA · best: 0.504 (Visual Voice Memory)
Lip2Wav (Chem) · ESTOI · 0.284 · SOTA · best: 0.429 (Visual Voice Memory)
Lip2Wav (Chem) · PESQ · 1.3 · SOTA · best: 1.529 (Visual Voice Memory)
Lip2Wav (Chem) · STOI · 0.416 · SOTA · best: 0.566 (Visual Voice Memory)
TCD-TIMIT corpus (mixed-speech) · ESTOI · 36.5 · SOTA
TCD-TIMIT corpus (mixed-speech) · PESQ · 1.35 · SOTA
TCD-TIMIT corpus (mixed-speech) · STOI · 0.558 · SOTA
GRID corpus (mixed-speech) · ESTOI · 0.535 · SOTA · best: 0.579 (Visual Voice Memory)
GRID corpus (mixed-speech) · PESQ · 1.772 · SOTA · best: 1.984 (Visual Voice Memory)
GRID corpus (mixed-speech) · STOI · 0.731 · SOTA · best: 0.738 (Visual Voice Memory)

Time Series · 3 results

Task: Lip Reading. All entries are dated 2020-05-17 and come from "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis" (arXiv:2005.08209).

Dataset · Metric · Value · Status
TCD-TIMIT corpus (mixed-speech) · WER · 31.26 · SOTA
GRID corpus (mixed-speech) · WER · 14.08 · SOTA
LRW · WER · 34.2 · SOTA
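
For readers unfamiliar with the WER metric listed above, the following is a minimal sketch of its conventional definition: the word-level edit distance (substitutions, deletions, insertions) between a reference and a hypothesis transcript, divided by the number of reference words and expressed as a percentage, so lower is better. The function name `wer` and the example sentences are illustrative assumptions. This page does not state how the source paper obtained or normalised its transcripts, so this should not be read as the paper's evaluation script.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Made-up example: one substitution and one deletion against six reference words.
example = wer("place blue at f two now", "place red at f now")
```

Because the denominator is the reference length, a hypothesis with many inserted words can push the value above 100.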