Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Whisper v2 +lang

Reported on 23 benchmarks across 1 task · 1 paper · 8 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
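The note above can be illustrated with a minimal sketch of exact-name grouping. Everything here is an assumption for illustration: the `group_by_exact_name` helper, the record fields, and the placeholder model and paper names are hypothetical and do not reflect how this site actually stores or matches results.

```python
from collections import defaultdict


def group_by_exact_name(results):
    """Group result records by the exact model-name string they report."""
    grouped = defaultdict(list)
    for record in results:
        # Exact string match: no normalization, no variant detection.
        grouped[record["model"]].append(record)
    return dict(grouped)


# Hypothetical records (placeholder names and scores, not real leaderboard data).
results = [
    {"model": "Model X", "paper": "Paper A", "score": 10.0},
    # Same name in another paper; it may describe a different variant.
    {"model": "Model X", "paper": "Paper B", "score": 12.0},
    # A differently spelled name is treated as a separate model.
    {"model": "Model X (large)", "paper": "Paper B", "score": 9.0},
]

grouped = group_by_exact_name(results)
for name in sorted(grouped):
    papers = sorted({r["paper"] for r in grouped[name]})
    print(name, "->", papers)
```

Under this kind of matching, two papers that reuse one name end up on the same model page even if they evaluated different variants, which is why the listed results are worth checking against the cited paper.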

Audio: 23 results

Speech Recognition on Jam-ALT
Case-Sensitive Word Error Rate · 2024-07-30
32.6
best: 20.1 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT French
Case-Sensitive Word Error Rate · 2024-07-30
30.5
best: 23.5 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT French
Word Error Rate (WER) · 2024-07-30
27.1
best: 20.8 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT Spanish
Case-Sensitive Word Error Rate · 2024-07-30
27.7
best: 17.7 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT Spanish
Word Error Rate (WER) · 2024-07-30
21.9
best: 12.6 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT German
Case-Sensitive Word Error Rate · 2024-07-30
26
best: 17.5 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT German
Word Error Rate (WER) · 2024-07-30
19.9
best: 12.6 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT English
Case-Sensitive Word Error Rate · 2024-07-30
43.7
best: 20.9 (AudioShake v3)
SOTA
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT
Line break F1 · 2024-07-30
70.4
best: 84.4 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT
Punctuation F1 · 2024-07-30
45
best: 57 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT
Section break F1 · 2024-07-30
3.7
best: 73.9 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT
Word Error Rate (WER) · 2024-07-30
27.9
best: 16.1 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT French
Line break F-1 · 2024-07-30
73.7
best: 88.6 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT French
Punctuation F-1 · 2024-07-30
45.3
best: 46.1 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT Spanish
Line break F-1 · 2024-07-30
71.5
best: 82.7 (AudioShake v1)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT Spanish
Punctuation F-1 · 2024-07-30
52.5
best: 56.7 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT Spanish
Section break F-1 · 2024-07-30
3.1
best: 69.6 (AudioShake v1)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT German
Line break F-1 · 2024-07-30
71.7
best: 83.7 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT German
Punctuation F-1 · 2024-07-30
48.4
best: 57.1 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT English
Line break F-1 · 2024-07-30
65.5
best: 84.3 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT English
Punctuation F-1 · 2024-07-30
34.9
best: 65.3 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT English
Section break F-1 · 2024-07-30
11.6
best: 84.8 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370
Speech Recognition on Jam-ALT English
Word Error Rate (WER) · 2024-07-30
39.7
best: 17.3 (AudioShake v3)
Lyrics Transcription for Humans: A Readability-Aware Benchmark arXiv:2408.06370