WHAMR!
WHAM! with synthetic reverberated sources
Audio | Speech | Introduced 2019-10-22
WHAMR! is a dataset for noisy and reverberant speech separation. It extends WHAM! by adding synthetic reverberation to the speech sources on top of the existing noise. Room impulse responses were generated and convolved with the speech sources using pyroomacoustics. Reverberation times were chosen to approximate domestic and classroom environments (expected to be similar to the restaurants and coffee shops where the WHAM! noise was collected), and were further classified as high, medium, or low reverberation based on a qualitative assessment of the mixture’s noise recording.
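
To illustrate what convolving a speech source with a room impulse response does, here is a minimal NumPy sketch. It is not the WHAMR! generation pipeline: the dataset used pyroomacoustics to simulate rooms, whereas this sketch substitutes a much simpler stand-in, exponentially decaying noise shaped to a target reverberation time (RT60). The function names, the 16 kHz sampling rate, and the RT60 value in the usage note are all assumptions chosen for illustration.

```python
import numpy as np


def synthetic_rir(rt60, fs=16000, seed=0):
    """Toy impulse response: white noise under a 60 dB exponential decay.

    Stand-in for a simulated room response, NOT what WHAMR! used.
    rt60 is in seconds; fs (Hz) is an assumed sampling rate.
    """
    rng = np.random.default_rng(seed)
    n = int(round(rt60 * fs))
    t = np.arange(n) / fs
    # Amplitude envelope that falls by 60 dB (factor 1e-3) at t = rt60.
    envelope = 10.0 ** (-3.0 * t / rt60)
    rir = rng.standard_normal(n) * envelope
    rir[0] = 1.0  # direct-path component
    return rir / np.max(np.abs(rir))


def reverberate(source, rir):
    """Convolve a dry source with an impulse response, keeping the source length."""
    return np.convolve(source, rir)[: len(source)]
```

For example, `reverberate(speech, synthetic_rir(rt60=0.5))` would return a reverberant version of a one-dimensional array `speech`, truncated to the original length so it can still be summed with a noise recording of the same duration.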
Benchmarks
Speech Enhancement: PESQ, SI-SDR, ΔPESQ, SI-SNR, SDR, ESTOI, SRMR, SI-SDRi
Speech Separation: SI-SDRi, MACs (G), Number of parameters (M), SDRi
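
For readers unfamiliar with the SI-SDR and SI-SDRi metrics listed among the benchmarks, the sketch below shows one common way to compute them with NumPy. It follows the usual definition (project the estimate onto the reference, then compare target energy to residual energy, in dB), with SI-SDRi taken as the gain over the unprocessed mixture. The function names, the mean removal, and the `eps` stabilizer are assumptions of this sketch; individual benchmark entries may use different implementations.

```python
import numpy as np


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB."""
    # Remove the mean so a constant offset does not affect the score.
    estimate = estimate - np.mean(estimate)
    reference = reference - np.mean(reference)
    # Optimal scaling of the reference toward the estimate.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    residual = estimate - target
    ratio = (np.sum(target ** 2) + eps) / (np.sum(residual ** 2) + eps)
    return 10.0 * np.log10(ratio)


def si_sdr_improvement(estimate, mixture, reference):
    """SI-SDRi: SI-SDR of the estimate minus SI-SDR of the input mixture."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)
```

Because the reference is rescaled before the comparison, multiplying the estimate by any nonzero constant leaves the score unchanged, which is the property that distinguishes SI-SDR from plain SDR.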