Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Natural Language Transduction on Lip Reading in the Wild

Metric: Top-1 Accuracy (higher is better)
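For readers unfamiliar with the metric, the following is a minimal sketch of how Top-1 accuracy is commonly computed for a word-level classification task: a sample counts as correct when the highest-scoring class matches the ground-truth label. The function name `top1_accuracy` and the toy scores are illustrative assumptions, not code from any of the listed papers or from the leaderboard itself.

```python
def top1_accuracy(scores, labels):
    """Return the percentage of samples whose highest-scoring class
    equals the ground-truth label.

    scores: list of per-class score lists, one per sample (hypothetical input)
    labels: list of integer class indices, one per sample
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the Top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example: four clips, three candidate word classes (made-up numbers).
example_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
example_labels = [1, 0, 2, 2]
example_result = top1_accuracy(example_scores, example_labels)
```

In the toy example, three of the four predictions match their labels, so the function reports 75.0. Individual papers may differ in evaluation details, such as whether word boundary information is supplied to the model.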


Results

| # | Model | Top-1 Accuracy (%) | Extra Data | Paper | Date | Code |
|---|-------|--------------------|------------|-------|------|------|
| 1 | SyncVSR (Word Boundary) | 95 | No | SyncVSR: Data-Efficient Visual Speech Recognitio... | 2024-06-18 | Code |
| 2 | 3D Conv + ResNet-18 + DC-TCN + KD (Ensemble & Word Boundary) | 94.1 | Yes | Training Strategies for Improved Lip-reading | 2022-09-03 | Code |
| 3 | SyncVSR | 93.2 | No | SyncVSR: Data-Efficient Visual Speech Recognitio... | 2024-06-18 | Code |
| 4 | AVCRFormer | 89.57 | No | - | - | Code |
| 5 | 3D Conv + EfficientNetV2 + Transformer + TCN | 89.52 | No | - | - | - |
| 6 | Vosk + MediaPipe + LS + MixUp + SA + 3DResNet-18 + BiLSTM + Cosine WR | 88.7 | No | - | - | - |
| 7 | 3D Conv + ResNet-18 + MS-TCN + Multi-Head Visual-Audio Memory | 88.5 | No | Distinguishing Homophenes Using Multi-Head Visua... | 2022-04-04 | Code |
| 8 | 3D Conv + ResNet-18 + MS-TCN + KD (Ensemble) | 88.5 | No | Towards Practical Lipreading with Distilled and ... | 2020-07-13 | Code |
| 9 | 3D-ResNet + Bi-GRU + MixUp + Label Smoothing + Cosine LR (Word Boundary) | 88.4 | No | Learn an Effective Lip Reading Model without Pains | 2020-11-15 | Code |
| 10 | 3D-ResNet + Bi-GRU + MixUp + Label Smoothing + Cosine LR | 85.5 | No | Learn an Effective Lip Reading Model without Pains | 2020-11-15 | Code |
| 11 | 3D Conv + ResNet-18 + Bi-GRU + Visual-Audio Memory | 85.4 | No | Multi-modality Associative Bridging through Memo... | 2022-04-04 | Code |
| 12 | 3D Conv + ResNet-18 + MS-TCN | 85.3 | No | Lipreading using Temporal Convolutional Networks | 2020-01-23 | Code |
| 13 | 3D Conv + ResNet-18 + Bi-GRU (Face Cutout) | 85.02 | No | Can We Read Speech Beyond the Lips? Rethinking R... | 2020-03-06 | Code |
| 14 | MoCo + Wav2Vec by SJTU LUMIA | 85 | No | Leveraging Unimodal Self-Supervised Learning for... | 2022-02-24 | Code |
| 15 | 3D Conv + P3D-ResNet50 + TCN | 84.8 | No | Discriminative Multi-modality Speech Recognition | 2020-05-12 | Code |
| 16 | 3D Conv + ResNet-18 + Bi-GRU | 84.41 | No | Mutual Information Maximization for Effective Li... | 2020-03-13 | Code |
| 17 | SpotFast + Transformer + Product-Key memory | 84.4 | No | SpotFast Networks with Memory Augmented Lateral ... | 2020-05-21 | Code |
| 18 | DFTN | 84.13 | No | Deformation Flow Based Two-Stream Network for Li... | 2020-03-12 | Code |
| 19 | PCPG | 83.5 | No | Pseudo-Convolutional Policy Gradient for Sequenc... | 2020-03-09 | - |
| 20 | 3D Conv + ResNet-34 + Bi-GRU | 83.39 | No | End-to-end Audiovisual Speech Recognition | 2018-02-18 | Code |
| 21 | Multi-grained + Bi-ConvLSTM | 83.34 | No | Multi-Grained Spatio-temporal Modeling for Lip-r... | 2019-08-30 | - |
| 22 | 3D Conv + ResNet-34 + Bi-LSTM | 83 | No | Combining Residual Networks with LSTMs for Lipre... | 2017-03-12 | Code |