Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

CRNN

Reported on 41 benchmarks across 7 tasks · 4 papers · 20 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Speech · 22 results

Dialogue on YouTube News dataset (No Noise)
Accuracy · 2021-10-05
0.967
SOTA
Is Attention always needed? A Case Study on Language Identification from Speech arXiv:2110.03427
Dialogue on IndicTTS
Classification Accuracy · 2021-10-05
0.987
SOTA
Is Attention always needed? A Case Study on Language Identification from Speech arXiv:2110.03427
Dialogue on YouTube News dataset (White Noise)
Accuracy · 2021-10-05
0.912
SOTA
Is Attention always needed? A Case Study on Language Identification from Speech arXiv:2110.03427
Spoken Language Understanding on YouTube News dataset (No Noise)
Accuracy · 2021-10-05
0.967
SOTA
Is Attention always needed? A Case Study on Language Identification from Speech arXiv:2110.03427
Spoken Language Understanding on IndicTTS
Classification Accuracy · 2021-10-05
0.987
SOTA
Is Attention always needed? A Case Study on Language Identification from Speech arXiv:2110.03427
Spoken Language Understanding on YouTube News dataset (White Noise)
Accuracy · 2021-10-05
0.912
SOTA
Is Attention always needed? A Case Study on Language Identification from Speech arXiv:2110.03427
Dialogue on YouTube News dataset (Background Music)
Accuracy · 2017-08-16
0.7
best: 0.89 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue on YouTube News dataset (Background Music)
F1 Score · 2017-08-16
0.7
best: 0.89 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue on YouTube News dataset (No Noise)
Accuracy · 2017-08-16
0.91
best: 0.967
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue on YouTube News dataset (No Noise)
F1 Score · 2017-08-16
0.91
best: 0.96 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue on YouTube News dataset (Crackling Noise)
Accuracy · 2017-08-16
0.82
best: 0.93 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue on YouTube News dataset (Crackling Noise)
F1 Score · 2017-08-16
0.83
best: 0.93 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue on YouTube News dataset (White Noise)
Accuracy · 2017-08-16
0.63
best: 0.912
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue on YouTube News dataset (White Noise)
F1 Score · 2017-08-16
0.63
best: 0.91 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Spoken Language Understanding on YouTube News dataset (Background Music)
Accuracy · 2017-08-16
0.7
best: 0.89 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Spoken Language Understanding on YouTube News dataset (Background Music)
F1 Score · 2017-08-16
0.7
best: 0.89 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Spoken Language Understanding on YouTube News dataset (No Noise)
Accuracy · 2017-08-16
0.91
best: 0.967
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Spoken Language Understanding on YouTube News dataset (No Noise)
F1 Score · 2017-08-16
0.91
best: 0.96 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Spoken Language Understanding on YouTube News dataset (Crackling Noise)
Accuracy · 2017-08-16
0.82
best: 0.93 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Spoken Language Understanding on YouTube News dataset (Crackling Noise)
F1 Score · 2017-08-16
0.83
best: 0.93 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Spoken Language Understanding on YouTube News dataset (White Noise)
Accuracy · 2017-08-16
0.63
best: 0.912
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Spoken Language Understanding on YouTube News dataset (White Noise)
F1 Score · 2017-08-16
0.63
best: 0.91 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811

Natural Language Processing · 11 results

Dialogue Understanding on YouTube News dataset (No Noise)
Accuracy · 2021-10-05
0.967
SOTA
Is Attention always needed? A Case Study on Language Identification from Speech arXiv:2110.03427
Dialogue Understanding on IndicTTS
Classification Accuracy · 2021-10-05
0.987
SOTA
Is Attention always needed? A Case Study on Language Identification from Speech arXiv:2110.03427
Dialogue Understanding on YouTube News dataset (White Noise)
Accuracy · 2021-10-05
0.912
SOTA
Is Attention always needed? A Case Study on Language Identification from Speech arXiv:2110.03427
Dialogue Understanding on YouTube News dataset (Background Music)
Accuracy · 2017-08-16
0.7
best: 0.89 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue Understanding on YouTube News dataset (Background Music)
F1 Score · 2017-08-16
0.7
best: 0.89 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue Understanding on YouTube News dataset (No Noise)
Accuracy · 2017-08-16
0.91
best: 0.967
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue Understanding on YouTube News dataset (No Noise)
F1 Score · 2017-08-16
0.91
best: 0.96 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue Understanding on YouTube News dataset (Crackling Noise)
Accuracy · 2017-08-16
0.82
best: 0.93 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue Understanding on YouTube News dataset (Crackling Noise)
F1 Score · 2017-08-16
0.83
best: 0.93 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue Understanding on YouTube News dataset (White Noise)
Accuracy · 2017-08-16
0.63
best: 0.912
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811
Dialogue Understanding on YouTube News dataset (White Noise)
F1 Score · 2017-08-16
0.63
best: 0.91 (Inception-v3 CRNN)
Language Identification Using Deep Convolutional Recurrent Neural Networks arXiv:1708.04811

Audio · 8 results

Sound Event Detection on WildDESED
PSDS1 (10dB) · 2024-07-04
0.222
best: 0.356 (CRNN (with BEATs + Separation))
SOTA
WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection System arXiv:2407.03656
Sound Event Detection on WildDESED
PSDS1 (Clean) · 2024-07-04
0.348
best: 0.5 (CRNN (with BEATs))
SOTA
WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection System arXiv:2407.03656
2D Semantic Segmentation on SVT
Accuracy · 2015-07-21
80.8
best: 99.1 (CLIP4STR-H (DFN-5B))
SOTA
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition arXiv:1507.05717
2D Semantic Segmentation on ICDAR 2003
Accuracy · 2015-07-21
89.4
best: 97.1 (Yet Another Text Recognizer)
SOTA
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition arXiv:1507.05717
2D Semantic Segmentation on ICDAR2013
Accuracy · 2015-07-21
86.7
best: 99.42 (CLIP4STR-L*)
SOTA
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition arXiv:1507.05717
Sound Event Detection on WildDESED
PSDS1 (-5dB) · 2024-07-04
0.017
best: 0.134 (CRNN (with BEATs + Separation))
WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection System arXiv:2407.03656
Sound Event Detection on WildDESED
PSDS1 (0dB) · 2024-07-04
0.064
best: 0.219 (CRNN (with BEATs + Separation))
WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection System arXiv:2407.03656
Sound Event Detection on WildDESED
PSDS1 (5dB) · 2024-07-04
0.148
best: 0.291 (CRNN (with BEATs + Separation))
WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection System arXiv:2407.03656

Computer Vision · 6 results

Scene Parsing on SVT
Accuracy · 2015-07-21
80.8
best: 99.1 (CLIP4STR-H (DFN-5B))
SOTA
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition arXiv:1507.05717
Scene Parsing on ICDAR 2003
Accuracy · 2015-07-21
89.4
best: 97.1 (Yet Another Text Recognizer)
SOTA
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition arXiv:1507.05717
Scene Parsing on ICDAR2013
Accuracy · 2015-07-21
86.7
best: 99.42 (CLIP4STR-L*)
SOTA
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition arXiv:1507.05717
Scene Text Recognition on SVT
Accuracy · 2015-07-21
80.8
best: 99.1 (CLIP4STR-H (DFN-5B))
SOTA
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition arXiv:1507.05717
Scene Text Recognition on ICDAR 2003
Accuracy · 2015-07-21
89.4
best: 97.1 (Yet Another Text Recognizer)
SOTA
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition arXiv:1507.05717
Scene Text Recognition on ICDAR2013
Accuracy · 2015-07-21
86.7
best: 99.42 (CLIP4STR-L*)
SOTA
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition arXiv:1507.05717