PromptSpeech

Speech | Introduced 2022-11-22

PromptSpeech is a dataset of speech paired with the corresponding prompts. The speech is synthesized with a commercial TTS API and varies along 5 style factors: gender, pitch, speaking speed, volume, and emotion. The emotion factor has 5 categories and the gender factor has 2 categories.
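As a rough illustration of how one entry could be represented in code, the sketch below models a speech/prompt pair with its five style factors. The class name `PromptSpeechItem`, its field names, the file path, and the placeholder labels are all assumptions made for illustration and do not reflect the dataset's actual file layout or label names. Only the factor names and the two category counts come from the description above.

```python
from dataclasses import dataclass, field

# The five style factors named in the dataset description.
STYLE_FACTORS = ("gender", "pitch", "speaking_speed", "volume", "emotion")

# Category counts stated in the description; counts for the
# remaining factors are not given there, so they are left out.
KNOWN_CATEGORY_COUNTS = {"emotion": 5, "gender": 2}


@dataclass
class PromptSpeechItem:
    """Hypothetical record layout, not the dataset's real schema."""
    audio_path: str                              # assumed: path to the synthesized speech
    prompt: str                                  # the prompt paired with the speech
    style: dict = field(default_factory=dict)    # factor name -> category label


def missing_factors(item: PromptSpeechItem) -> list:
    """Return the style factors that the item does not label."""
    return [f for f in STYLE_FACTORS if f not in item.style]


# Placeholder labels only; the real category names are not listed above.
example = PromptSpeechItem(
    audio_path="example.wav",
    prompt="<prompt text>",
    style={"gender": "<label>", "emotion": "<label>"},
)
print(missing_factors(example))
```

A check like `missing_factors` could be useful when loading the data, since a style-controlled TTS model would presumably need every factor labeled for each utterance.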

Source: PromptTTS: Controllable Text-to-Speech with Text Descriptions

Image Source: https://arxiv.org/pdf/2211.12171v1.pdf