Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Speech Recognition on LRS3-TED

Metric: Word Error Rate (WER) (lower is better)
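
For readers unfamiliar with the metric, here is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the scoring scripts behind the papers listed here may normalize text differently. Leaderboards usually quote this ratio multiplied by 100, but whether that applies to the values on this page is also an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a ratio: word-level edit distance / reference length.

    Illustrative sketch only: tokenizes on whitespace and applies no
    casing or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, the ratio can exceed 1.0 when the hypothesis is much longer than the reference.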


Results

| # | Model | Word Error Rate (WER) | Extra Data | Paper | Date | Code |
|---|-------|-----------------------|------------|-------|------|------|
| 1 | Whisper | 0.68 | Yes | Whisper-Flamingo: Integrating Visual Features in... | 2024-06-14 | Code |
| 2 | Llama-AVSR | 0.81 | Yes | Large Language Models are Strong Audio-Visual Sp... | 2024-09-18 | Code |
| 3 | CTC/Attention | 1 | No | Auto-AVSR: Audio-Visual Speech Recognition with ... | 2023-03-25 | Code |
| 4 | AV-HuBERT Large | 1.3 | Yes | Learning Audio-Visual Speech Representation by M... | 2022-01-05 | Code |
| 5 | RAVEn Large | 1.4 | Yes | Jointly Learning Visual and Auditory Speech Repr... | 2022-12-12 | Code |
| 6 | CTC/Attention | 19.1 | Yes | Auto-AVSR: Audio-Visual Speech Recognition with ... | 2023-03-25 | Code |
| 7 | VTP with more data | 30.7 | Yes | Sub-word Level Lip Reading With Visual Attention | 2021-10-14 | - |
| 8 | VTP | 40.6 | Yes | Sub-word Level Lip Reading With Visual Attention | 2021-10-14 | - |