Watch Your Mouth: Point Clouds based Speech Recognition Dataset

Point cloud · Speech · Videos · Introduced 2024-05-11

The Watch Your Mouth dataset is a custom silent speech dataset of depth-only recordings in which users silently mouth full English sentences. The recordings were captured with consumer-grade depth cameras such as the iPhone TrueDepth sensor. Sentences were curated to cover diverse visemic and phonetic patterns, so that models trained on the data can generalize across varied speech content. Each sentence-level utterance provides a temporally aligned depth sequence and the corresponding ground-truth text. For more details, see the paper "Watch Your Mouth: Silent Speech Recognition with Depth Sensing".
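
Since the dataset is tagged as point cloud data built from depth recordings, the following minimal sketch shows how a single depth frame could be back-projected into a point cloud with a standard pinhole camera model. It is an illustration only: the function name `depth_to_point_cloud`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the assumption that frames arrive as 2D arrays of depth in metres are all hypothetical and are not taken from the dataset's documentation, which may store the data in a different form.

```python
import numpy as np


def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (N, 3) point cloud.

    Hypothetical helper: assumes a pinhole camera with focal lengths
    (fx, fy) and principal point (cx, cy) in pixels, and depth values
    in metres. Pixels with non-finite or non-positive depth are dropped.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    # u holds the column index and v the row index of every pixel.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = np.isfinite(depth) & (depth > 0)
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)


def sequence_to_point_clouds(frames, fx, fy, cx, cy):
    """Convert a depth sequence (iterable of depth maps) frame by frame."""
    return [depth_to_point_cloud(f, fx, fy, cx, cy) for f in frames]
```

Applied to every frame of an utterance, this would yield one point cloud per time step, which preserves the temporal alignment with the ground-truth sentence. The real camera intrinsics would need to come from the capture device or the dataset's metadata.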