
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

OWSM v3.1 +lang

Reported on 24 benchmarks across 1 task · 1 paper · 5 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Audio: 24 results

Speech Recognition on Jam-ALT
Case-Sensitive Word Error Rate · 2024-07-30
75
best: 20.1 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT French
Case-Sensitive Word Error Rate · 2024-07-30
75.7
best: 23.5 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT Spanish
Case-Sensitive Word Error Rate · 2024-07-30
78.5
best: 17.7 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT German
Case-Sensitive Word Error Rate · 2024-07-30
71.8
best: 17.5 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT English
Case-Sensitive Word Error Rate · 2024-07-30
74
best: 20.9 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT
Line break F1 · 2024-07-30
37.8
best: 84.4 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT
Parenthesis F-1 · 2024-07-30
0.6
best: 29.4 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT
Punctuation F1 · 2024-07-30
22.5
best: 57 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT
Word Error Rate (WER) · 2024-07-30
69.3
best: 16.1 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT French
Line break F-1 · 2024-07-30
36
best: 88.6 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT French
Parenthesis F-1 · 2024-07-30
1.9
best: 41.3 (AudioShake v1)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT French
Punctuation F-1 · 2024-07-30
30.6
best: 46.1 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT French
Word Error Rate (WER) · 2024-07-30
71.6
best: 20.8 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT Spanish
Line break F-1 · 2024-07-30
30.2
best: 82.7 (AudioShake v1)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT Spanish
Punctuation F-1 · 2024-07-30
8.8
best: 56.7 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT Spanish
Word Error Rate (WER) · 2024-07-30
73.3
best: 12.6 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT German
Line break F-1 · 2024-07-30
40.7
best: 83.7 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT German
Punctuation F-1 · 2024-07-30
28.6
best: 57.1 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT German
Word Error Rate (WER) · 2024-07-30
63.3
best: 12.6 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT English
Line break F-1 · 2024-07-30
42.7
best: 84.3 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT English
Punctuation F-1 · 2024-07-30
22.3
best: 65.3 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT English
Word Error Rate (WER) · 2024-07-30
68.6
best: 17.3 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT Spanish
Parenthesis F-1
0
best: 38 (AudioShake v1)
Speech Recognition on Jam-ALT German
Parenthesis F-1
0
best: 76.6 (AudioShake v3)