Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors

Julien Hauret, Malo Olivier, Thomas Joubaud, Christophe Langrenne, Sarah Poirée, Véronique Zimpfer, Éric Bavu

2024-07-16
Tasks: Speech Recognition, Automatic Speech Recognition (ASR), Speaker Verification, Bandwidth Extension, Automatic Phoneme Recognition, Speech Enhancement

Abstract

Vibravox is a dataset compliant with the General Data Protection Regulation (GDPR) containing audio recordings made with five different body-conduction audio sensors: two in-ear microphones, two bone conduction vibration pickups, and a laryngophone. The dataset also includes audio data from an airborne microphone used as a reference. The Vibravox corpus contains 45 hours per sensor of speech samples and physiological sounds recorded by 188 participants under different acoustic conditions imposed by a high-order ambisonics 3D spatializer. Annotations about the recording conditions and linguistic transcriptions are also included in the corpus. We conducted a series of experiments on various speech-related tasks, including speech recognition, speech enhancement, and speaker verification. These experiments were carried out using state-of-the-art models to evaluate and compare their performance on signals captured by the different audio sensors offered by the Vibravox dataset, with the aim of gaining a better grasp of their individual characteristics.

Results

Task | Dataset | Metric | Value | Model
Speech Recognition | VibraVox (throat microphone) | Test PER | 0.073 | medium wav2vec2.0
Speech Recognition | VibraVox (headset microphone) | Test PER | 0.028 | medium wav2vec2.0
Speech Recognition | VibraVox (forehead accelerometer) | Test PER | 0.046 | medium wav2vec2.0
Speech Recognition | VibraVox (soft in-ear microphone) | Test PER | 0.041 | medium wav2vec2.0
Speech Recognition | VibraVox (rigid in-ear microphone) | Test PER | 0.045 | medium wav2vec2.0
Speech Recognition | VibraVox (temple vibration pickup) | Test PER | 0.142 | medium wav2vec2.0
Speaker Verification | VibraVox (soft in-ear microphone) | Test EER | 0.0172 | ECAPA2
Speaker Verification | VibraVox (soft in-ear microphone) | Test min-DCF | 0.1 | ECAPA2
Speaker Verification | VibraVox (temple vibration pickup) | Test EER | 0.08 | ECAPA2
Speaker Verification | VibraVox (temple vibration pickup) | Test min-DCF | 0.58 | ECAPA2
Speaker Verification | VibraVox (rigid in-ear microphone) | Test EER | 0.0316 | ECAPA2
Speaker Verification | VibraVox (rigid in-ear microphone) | Test min-DCF | 0.21 | ECAPA2
Speaker Verification | VibraVox (forehead accelerometer) | Test EER | 0.009 | ECAPA2
Speaker Verification | VibraVox (forehead accelerometer) | Test min-DCF | 0.06 | ECAPA2
Speaker Verification | VibraVox (throat microphone) | Test EER | 0.0353 | ECAPA2
Speaker Verification | VibraVox (throat microphone) | Test min-DCF | 0.2 | ECAPA2
Speaker Verification | VibraVox (headset microphone) | Test EER | 0.0026 | ECAPA2
Speaker Verification | VibraVox (headset microphone) | Test min-DCF | 0.02 | ECAPA2
Speech Enhancement | VibraVox (forehead accelerometer) | EER (ECAPA2) | 0.0183 | Configurable EBEN (M=4, P=4, Q=4)
Speech Enhancement | VibraVox (forehead accelerometer) | NORESQA-MOS | 4.25 | Configurable EBEN (M=4, P=4, Q=4)
Speech Enhancement | VibraVox (forehead accelerometer) | PER (wav2vec2) | 0.091 | Configurable EBEN (M=4, P=4, Q=4)
Speech Enhancement | VibraVox (forehead accelerometer) | STOI | 0.855 | Configurable EBEN (M=4, P=4, Q=4)
Speech Enhancement | VibraVox (temple vibration pickup) | EER (ECAPA2) | 0.1622 | Configurable EBEN (M=4, P=1, Q=4)
Speech Enhancement | VibraVox (temple vibration pickup) | NORESQA-MOS | 3.632 | Configurable EBEN (M=4, P=1, Q=4)
Speech Enhancement | VibraVox (temple vibration pickup) | PER (wav2vec2) | 0.391 | Configurable EBEN (M=4, P=1, Q=4)
Speech Enhancement | VibraVox (temple vibration pickup) | STOI | 0.763 | Configurable EBEN (M=4, P=1, Q=4)
Speech Enhancement | VibraVox (throat microphone) | EER (ECAPA2) | 0.0847 | Configurable EBEN (M=4, P=2, Q=4)
Speech Enhancement | VibraVox (throat microphone) | NORESQA-MOS | 3.862 | Configurable EBEN (M=4, P=2, Q=4)
Speech Enhancement | VibraVox (throat microphone) | PER (wav2vec2) | 0.179 | Configurable EBEN (M=4, P=2, Q=4)
Speech Enhancement | VibraVox (throat microphone) | STOI | 0.834 | Configurable EBEN (M=4, P=2, Q=4)
Speech Enhancement | VibraVox (soft in-ear microphone) | EER (ECAPA2) | 0.0488 | Configurable EBEN (M=4, P=2, Q=4)
Speech Enhancement | VibraVox (soft in-ear microphone) | NORESQA-MOS | 4.331 | Configurable EBEN (M=4, P=2, Q=4)
Speech Enhancement | VibraVox (soft in-ear microphone) | PER (wav2vec2) | 0.087 | Configurable EBEN (M=4, P=2, Q=4)
Speech Enhancement | VibraVox (soft in-ear microphone) | STOI | 0.868 | Configurable EBEN (M=4, P=2, Q=4)
Speech Enhancement | VibraVox (rigid in-ear microphone) | EER (ECAPA2) | 0.0364 | Configurable EBEN (M=4, P=2, Q=4)
Speech Enhancement | VibraVox (rigid in-ear microphone) | NORESQA-MOS | 4.285 | Configurable EBEN (M=4, P=2, Q=4)
Speech Enhancement | VibraVox (rigid in-ear microphone) | PER (wav2vec2) | 0.084 | Configurable EBEN (M=4, P=2, Q=4)
Speech Enhancement | VibraVox (rigid in-ear microphone) | STOI | 0.877 | Configurable EBEN (M=4, P=2, Q=4)
Automatic Speech Recognition (ASR) | VibraVox (throat microphone) | Test PER | 0.073 | medium wav2vec2.0
Automatic Speech Recognition (ASR) | VibraVox (headset microphone) | Test PER | 0.028 | medium wav2vec2.0
Automatic Speech Recognition (ASR) | VibraVox (forehead accelerometer) | Test PER | 0.046 | medium wav2vec2.0
Automatic Speech Recognition (ASR) | VibraVox (soft in-ear microphone) | Test PER | 0.041 | medium wav2vec2.0
Automatic Speech Recognition (ASR) | VibraVox (rigid in-ear microphone) | Test PER | 0.045 | medium wav2vec2.0
Automatic Speech Recognition (ASR) | VibraVox (temple vibration pickup) | Test PER | 0.142 | medium wav2vec2.0
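
Two of the metrics reported above, phoneme error rate (PER) and equal error rate (EER), are simple enough to illustrate directly. The following is a minimal sketch of their standard textbook definitions, not the scoring pipeline used by the authors, which this page does not describe. The function names, the toy phoneme sequences, and the toy trial scores are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """Total edit distance divided by the total number of reference phonemes."""
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference transcripts contain no phonemes")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_ref


def equal_error_rate(target_scores: Sequence[float],
                     nontarget_scores: Sequence[float]) -> float:
    """Approximate EER: sweep every observed score as a threshold and return
    the mean of false-acceptance and false-rejection rates where they are closest."""
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        # A trial is accepted when its score is >= the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy data (hypothetical): one substitution and one deletion over 5 phonemes.
ref = [["b", "o~", "Z", "u", "R"]]
hyp = [["b", "o", "Z", "u"]]
print(phoneme_error_rate(ref, hyp))  # 0.4

# Toy verification trials (hypothetical similarity scores).
same_speaker = [0.9, 0.8, 0.7, 0.3]
different_speaker = [0.6, 0.4, 0.2, 0.1]
print(equal_error_rate(same_speaker, different_speaker))  # 0.25
```

Read this way, the Test PER of 0.028 for the headset microphone corresponds to roughly 2.8 phoneme errors per 100 reference phonemes, and lower is better for both PER and EER. Production evaluations typically interpolate between thresholds when computing EER; the threshold sweep here is a simplification.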

Related Papers

Task-Specific Audio Coding for Machines: Machine-Learned Latent Features Are Codes for That Machine (2025-07-17)
NonverbalTTS: A Public English Corpus of Text-Aligned Nonverbal Vocalizations with Emotion Annotations for Text-to-Speech (2025-07-17)
SHIELD: A Secure and Highly Enhanced Integrated Learning for Robust Deepfake Detection against Adversarial Attacks (2025-07-17)
Autoregressive Speech Enhancement via Acoustic Tokens (2025-07-17)
P.808 Multilingual Speech Enhancement Testing: Approach and Results of URGENT 2025 Challenge (2025-07-15)
WhisperKit: On-device Real-time ASR with Billion-Scale Transformers (2025-07-14)
VisualSpeaker: Visually-Guided 3D Avatar Lip Synthesis (2025-07-08)
Robust One-step Speech Enhancement via Consistency Distillation (2025-07-08)