SpeechInstruct
Speech, Texts · Introduced 2023-05-18
SpeechInstruct is a large-scale cross-modal speech instruction dataset. It contains 37,969 quadruplets composed of speech instructions, text instructions, text responses, and speech responses.
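As a rough illustration of the quadruplet structure described above, one record might be modeled as follows. This is a minimal sketch, not the dataset's actual schema: the class name, the field names, the `is_complete` helper, and the choice to store each speech item as a string (for example an audio file path) are all assumptions made for this example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SpeechInstructRecord:
    """Hypothetical container for one quadruplet (field names are assumed)."""

    speech_instruction: str  # assumed: a reference to the spoken instruction
    text_instruction: str    # the instruction as text
    text_response: str       # the response as text
    speech_response: str     # assumed: a reference to the spoken response


def is_complete(record: SpeechInstructRecord) -> bool:
    """Return True when all four parts of the quadruplet are non-empty."""
    return all(getattr(record, f.name) for f in fields(record))


# Made-up values, used only to show the shape of a record.
example = SpeechInstructRecord(
    speech_instruction="instruction_0001.wav",
    text_instruction="Describe the weather today.",
    text_response="It is sunny with a light breeze.",
    speech_response="response_0001.wav",
)
```

A check such as `is_complete` is one simple way to confirm that every loaded record carries all four parts before it is used.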
Source: SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities