High Fidelity Speech Enhancement with Band-split RNN

Jianwei Yu, Yi Luo, Hangting Chen, Rongzhi Gu, Chao Weng

2022-12-01

Vocal Bursts Intensity Prediction, Speech Enhancement

Abstract

Despite the rapid progress in speech enhancement (SE) research, enhancing the quality of desired speech in environments with strong noise and interfering speakers remains challenging. In this paper, we extend the application of the recently proposed band-split RNN (BSRNN) model to full-band SE and personalized SE (PSE) tasks. To mitigate the effects of unstable high-frequency components in full-band speech, we apply bi-directional band-level modeling to low-frequency subbands and uni-directional band-level modeling to high-frequency subbands. For the PSE task, we incorporate a speaker enrollment module into BSRNN to utilize target speaker information. Moreover, we utilize a MetricGAN discriminator (MGD) and a multi-resolution spectrogram discriminator (MRSD) to improve perceptual quality metrics. Experimental results show that our system outperforms various top-ranking SE systems, achieves state-of-the-art (SOTA) results on the DNS-2020 test set, and ranks among the top 3 in the DNS-2023 challenge.
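As a rough illustration of the band-split idea mentioned in the abstract, the sketch below splits a spectrogram along the frequency axis into subbands and labels each one for bi-directional or uni-directional band-level modeling. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `split_bands` and `assign_direction`, the band widths, and the cutoff bin are all hypothetical and are not taken from the paper.

```python
import numpy as np


def split_bands(spec, band_widths):
    """Split a (freq_bins, frames) spectrogram into consecutive subbands.

    band_widths is a hypothetical list of bin counts, one per subband.
    """
    spec = np.asarray(spec)
    if sum(band_widths) != spec.shape[0]:
        raise ValueError("band widths must sum to the number of frequency bins")
    # Cumulative edges: [0, w0, w0 + w1, ...]
    edges = np.cumsum([0] + list(band_widths))
    return [spec[lo:hi] for lo, hi in zip(edges[:-1], edges[1:])]


def assign_direction(band_widths, cutoff_bin):
    """Label each subband 'bi' or 'uni' for band-level modeling.

    A subband counts as low-frequency when its upper edge does not exceed
    cutoff_bin (an assumed threshold, chosen only for illustration).
    """
    upper_edges = np.cumsum(band_widths)
    return ["bi" if edge <= cutoff_bin else "uni" for edge in upper_edges]


# Toy example: 32 frequency bins, 10 frames, 4 subbands.
widths = [4, 4, 8, 16]
spec = np.zeros((32, 10), dtype=np.complex64)
bands = split_bands(spec, widths)
directions = assign_direction(widths, cutoff_bin=8)
```

In this toy setup the two narrow low-frequency subbands would be modeled bi-directionally and the two wider high-frequency subbands uni-directionally; the actual band layout and cutoff used in the paper are not given in the abstract.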

Results

Task	Dataset	Metric	Value	Model
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	PESQ-NB	3.89	BSRNN-S + MRSD
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	PESQ-WB	3.53	BSRNN-S + MRSD
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	SI-SDR-WB	21.4	BSRNN-S + MRSD
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	STOI	98.4	BSRNN-S + MRSD
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	PESQ-NB	3.87	BSRNN-16k
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	PESQ-WB	3.45	BSRNN-16k
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	SI-SDR-WB	21.1	BSRNN-16k
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	STOI	98.3	BSRNN-16k
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	PESQ-WB	3.42	BSRNN-S
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	SI-SDR-WB	21.3	BSRNN-S
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	PESQ-NB	3.79	BSRNN
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	PESQ-WB	3.32	BSRNN
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	STOI	98	BSRNN
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	PESQ-NB	3.85	BSRNN-S + MGD
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	SI-SDR-WB	21.4	BSRNN-S + MGD
Speech Enhancement	Deep Noise Suppression (DNS) Challenge	STOI	98.4	BSRNN-S + MGD
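For readers unfamiliar with the SI-SDR rows above, the following is a minimal sketch of the commonly used scale-invariant signal-to-distortion ratio, in dB, for a single pair of time-domain signals. It follows the standard definition (zero-mean both signals, project the estimate onto the reference, compare target and residual energy); the function name `si_sdr` and the `eps` stabilizer are assumptions made here, and the evaluation code behind the table may differ in detail.

```python
import numpy as np


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB for two 1-D signals of equal length."""
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    # Remove the mean so a DC offset does not affect the score.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Optimal scaling of the reference that best matches the estimate.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    residual = estimate - target
    ratio = (np.dot(target, target) + eps) / (np.dot(residual, residual) + eps)
    return 10.0 * np.log10(ratio)


# Toy check: rescaling the estimate should leave the score unchanged.
rng = np.random.default_rng(0)
ref = rng.standard_normal(16000)
est = ref + 0.1 * rng.standard_normal(16000)
score = si_sdr(est, ref)
score_rescaled = si_sdr(3.0 * est, ref)
```

Because the estimate is projected onto the reference before the energies are compared, multiplying the estimate by a constant gain does not change the result, which is what distinguishes SI-SDR from a plain SNR.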
