Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

LibriSpeech

Audio | Speech | CC BY 4.0 | Introduced 2015-01-01

The LibriSpeech corpus is a collection of approximately 1,000 hours of audiobooks from the LibriVox project. Most of the audiobooks come from Project Gutenberg. The training data is split into three partitions of 100, 360, and 500 hours, while the dev and test data are each split into "clean" and "other" categories, depending on how easy or challenging the recordings are for Automatic Speech Recognition systems. Each of the dev and test sets contains around 5 hours of audio. The corpus also provides n-gram language models and the corresponding texts excerpted from the Project Gutenberg books, which contain 803M tokens and 977K unique words.

Source: State-of-the-art Speech Recognition using Multi-stream Self-attention with Dilated 1D Convolutions
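
As a quick sanity check on the figures in the description, the nominal partition sizes can be added up. This is only a sketch: the hour values are the rounded numbers quoted above, not measured durations, and the subset names are the conventional LibriSpeech names, which the description itself does not spell out (assumption).

```python
# Nominal sizes in hours, taken from the rounded figures in the description.
# Subset names are the conventional LibriSpeech names (an assumption here;
# the description only gives the sizes and the clean/other distinction).
TRAIN_HOURS = {
    "train-clean-100": 100,
    "train-clean-360": 360,
    "train-other-500": 500,
}
EVAL_HOURS = {
    "dev-clean": 5,
    "dev-other": 5,
    "test-clean": 5,
    "test-other": 5,
}


def total_hours(*partitions):
    """Sum the nominal hours across one or more partition tables."""
    return sum(sum(p.values()) for p in partitions)


train_total = total_hours(TRAIN_HOURS)
corpus_total = total_hours(TRAIN_HOURS, EVAL_HOURS)

print(train_total)   # 960
print(corpus_total)  # 980
```

The nominal total of about 980 hours is consistent with the "approximately 1,000 hours" stated in the description.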

Related Benchmarks

LibriSpeech 100h test-clean/Speech Recognition/Word Error Rate (WER)
LibriSpeech 100h test-other/Speech Recognition/Word Error Rate (WER)
LibriSpeech test-clean/1 Image, 2*2 Stitchi/Character Error Rate (CER)
LibriSpeech test-clean/1 Image, 2*2 Stitchi/Equal Error Rate
LibriSpeech test-clean/1 Image, 2*2 Stitchi/Word Error Rate (WER)
LibriSpeech test-clean/2D Classification/Character Error Rate (CER)
LibriSpeech test-clean/2D Classification/Equal Error Rate
LibriSpeech test-clean/2D Classification/Word Error Rate (WER)
LibriSpeech test-clean/Speech Recognition/Word Error Rate (WER)
LibriSpeech test-clean/Voice Conversion/Character Error Rate (CER)
LibriSpeech test-clean/Voice Conversion/Equal Error Rate
LibriSpeech test-clean/Voice Conversion/Word Error Rate (WER)
LibriSpeech test-other/Speech Recognition/Word Error Rate (WER)
LibriSpeech train-clean-100 test-clean/Speech Recognition/Word Error Rate (WER)
LibriSpeech train-clean-100 test-other/Speech Recognition/Word Error Rate (WER)
LibriSpeechDuplicate/Speech Enhancement/Audio Quality MOS
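
Most of the benchmarks above report Word Error Rate (WER) or Character Error Rate (CER). A minimal sketch of the usual definition follows: the edit distance between reference and hypothesis, divided by the reference length. The function names are hypothetical, no text normalization (casing, punctuation) is applied, and this is not necessarily the scoring procedure used by any particular benchmark listed here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance against an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference item missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra item in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
score = wer("the cat sat on the mat", "the cat sit on mat")
print(round(score, 3))  # 0.333
```

Because insertions count as errors, WER and CER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.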

Statistics

Papers: 2,361
Benchmarks: 0

Links

Homepage

Tasks

Automatic Speech Recognition
Resynthesis
Speech Recognition
Voice Conversion