Description
WaveGlow is a flow-based generative model that generates audio by sampling from a distribution. Specifically, samples are drawn from a zero-mean spherical Gaussian with the same number of dimensions as the desired output, and those samples are passed through a series of layers that transform this simple distribution into the desired one.
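The sampling procedure described above can be illustrated with a minimal sketch. It is not the WaveGlow implementation: the layers here are toy scalar affine maps chosen only because they are trivially invertible, whereas WaveGlow itself uses affine coupling layers and invertible 1x1 convolutions conditioned on mel-spectrograms. The names `sample_latent`, `forward`, `inverse`, and the values in `layers` and `n_dims` are all hypothetical.

```python
import random

def sample_latent(n_dims, rng):
    """Draw one sample from a zero-mean spherical Gaussian (unit variance per dimension)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_dims)]

def forward(z, layers):
    """Push a latent sample through each layer in order to produce the output."""
    x = list(z)
    for scale, shift in layers:
        # Toy element-wise affine map; invertible as long as scale != 0.
        x = [scale * v + shift for v in x]
    return x

def inverse(x, layers):
    """Undo the layers in reverse order to recover the latent sample."""
    z = list(x)
    for scale, shift in reversed(layers):
        z = [(v - shift) / scale for v in z]
    return z

rng = random.Random(0)
n_dims = 8                           # hypothetical output length; real audio is far longer
layers = [(0.5, 0.1), (2.0, -0.3)]   # hypothetical (scale, shift) pairs, not learned values

z = sample_latent(n_dims, rng)       # latent has the same dimensionality as the output
x = forward(z, layers)               # transformed sample
z_rec = inverse(x, layers)           # invertibility check: should match z
```

Because every layer is invertible, the output `x` can be mapped back to the latent `z`. That property is what allows a flow-based model to be trained by directly maximizing the likelihood of the data.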
Papers Using This Method
Enhancing Kurdish Text-to-Speech with Native Corpus Training: A High-Quality WaveGlow Vocoder Approach (2024-09-10)
Rapid Speaker Adaptation in Low Resource Text to Speech Systems using Synthetic Data and Transfer learning (2023-12-02)
Code-Mixed Text to Speech Synthesis under Low-Resource Constraints (2023-12-02)
Affective social anthropomorphic intelligent system (2023-04-19)
Adaptive re-calibration of channel-wise features for Adversarial Audio Classification (2022-10-21)
NatiQ: An End-to-end Text-to-Speech System for Arabic (2022-06-15)
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis (2021-09-27)
Adaptation of Tacotron2-based Text-To-Speech for Articulatory-to-Acoustic Mapping using Ultrasound Tongue Imaging (2021-07-26)
A Flow-Based Neural Network for Time Domain Speech Enhancement (2021-06-16)
Low Bit-Rate Wideband Speech Coding: A Deep Generative Model based Approach (2021-02-04)
Text-to-speech for the hearing impaired (2020-12-03)
MelGlow: Efficient Waveform Generative Network Based on Location-Variable Convolution (2020-12-03)
StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization (2020-11-03)
A comparison of Vietnamese Statistical Parametric Speech Synthesis Systems (2020-05-26)
Probing the phonetic and phonological knowledge of tones in Mandarin TTS models (2019-12-23)
WaveFlow: A Compact Flow-based Model for Raw Audio (2019-12-03)
Speaker independence of neural vocoders and their effect on parametric resynthesis speech enhancement (2019-11-14)
Transferring neural speech waveform synthesizers to musical instrument sounds generation (2019-10-27)
Parametric Resynthesis with neural vocoders (2019-06-16)
WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation (2019-04-05)