WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection System

Yang Xiao, Rohan Kumar Das

2024-07-04 | Sound Event Detection | Event Detection | Large Language Model | Language Modelling

Abstract

This work aims to advance sound event detection (SED) research by presenting a new large language model (LLM)-powered dataset, wild domestic environment sound event detection (WildDESED). It extends the original DESED dataset to reflect the diverse acoustic variability and complex noises of home settings. We used LLMs to generate eight different domestic scenarios based on the target sound categories of the DESED dataset. We then enriched the scenarios with a carefully tailored mixture of noises selected from AudioSet, ensuring no overlap with the target sounds. We evaluate a widely used convolutional recurrent neural network (CRNN) on WildDESED, and the results show how challenging the dataset is. We then apply curriculum learning, gradually increasing noise complexity, to improve the model's generalization across noise levels. Results with this approach show improvements in noisy environments, validating its effectiveness on WildDESED and promoting noise-robust SED.
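The two mechanisms the abstract describes, mixing noise into a clip at a chosen signal-to-noise ratio and ordering training from easy to hard noise conditions, can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function names `mix_at_snr`, `measured_snr_db`, and `curriculum_stages` are hypothetical, the SNR levels are the ones reported in the results table, and the clean-first, lowest-SNR-last ordering is an assumption about what "gradually increasing noise complexity" means.

```python
import numpy as np


def mix_at_snr(target, noise, snr_db):
    """Add `noise` to `target`, scaled so the mixture has the requested SNR (dB)."""
    target = np.asarray(target, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Repeat or trim the noise so it matches the target length.
    noise = np.resize(noise, target.shape)
    p_target = np.mean(target ** 2)
    p_noise = np.mean(noise ** 2)
    if p_target == 0.0 or p_noise == 0.0:
        raise ValueError("target and noise must both have non-zero power")
    # SNR_dB = 10 * log10(p_target / (gain^2 * p_noise)), solved for gain.
    gain = np.sqrt(p_target / (p_noise * 10.0 ** (snr_db / 10.0)))
    return target + gain * noise


def measured_snr_db(target, mixture):
    """Recover the SNR of a mixture when the clean target is known."""
    target = np.asarray(target, dtype=np.float64)
    residual = np.asarray(mixture, dtype=np.float64) - target
    return 10.0 * np.log10(np.mean(target ** 2) / np.mean(residual ** 2))


def curriculum_stages(snr_levels_db):
    """Assumed schedule: clean audio first (None), then decreasing SNR."""
    return [None] + sorted(snr_levels_db, reverse=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.standard_normal(16000)       # stand-in for a 1 s target clip
    background = rng.standard_normal(8000)  # stand-in for a shorter noise clip
    for stage in curriculum_stages([-5, 0, 5, 10]):
        audio = clip if stage is None else mix_at_snr(clip, background, stage)
        label = "clean" if stage is None else f"{stage} dB"
        print(label, audio.shape)
```

In a real training loop each stage would run for some number of epochs before moving to the next; how long each stage lasts is not stated in the abstract.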

Results

| Task                  | Dataset   | Metric        | Value | Model                                  |
|-----------------------|-----------|---------------|-------|----------------------------------------|
| Sound Event Detection | WildDESED | PSDS1 (-5dB)  | 0.049 | CRNN (WildDESED + Curriculum learning) |
| Sound Event Detection | WildDESED | PSDS1 (0dB)   | 0.114 | CRNN (WildDESED + Curriculum learning) |
| Sound Event Detection | WildDESED | PSDS1 (10dB)  | 0.212 | CRNN (WildDESED + Curriculum learning) |
| Sound Event Detection | WildDESED | PSDS1 (5dB)   | 0.175 | CRNN (WildDESED + Curriculum learning) |
| Sound Event Detection | WildDESED | PSDS1 (Clean) | 0.265 | CRNN (WildDESED + Curriculum learning) |
| Sound Event Detection | WildDESED | PSDS1 (-5dB)  | 0.048 | CRNN (WildDESED)                       |
| Sound Event Detection | WildDESED | PSDS1 (0dB)   | 0.087 | CRNN (WildDESED)                       |
| Sound Event Detection | WildDESED | PSDS1 (10dB)  | 0.175 | CRNN (WildDESED)                       |
| Sound Event Detection | WildDESED | PSDS1 (5dB)   | 0.135 | CRNN (WildDESED)                       |
| Sound Event Detection | WildDESED | PSDS1 (Clean) | 0.2   | CRNN (WildDESED)                       |
| Sound Event Detection | WildDESED | PSDS1 (-5dB)  | 0.017 | CRNN                                   |
| Sound Event Detection | WildDESED | PSDS1 (0dB)   | 0.064 | CRNN                                   |
| Sound Event Detection | WildDESED | PSDS1 (10dB)  | 0.222 | CRNN                                   |
| Sound Event Detection | WildDESED | PSDS1 (5dB)   | 0.148 | CRNN                                   |
| Sound Event Detection | WildDESED | PSDS1 (Clean) | 0.348 | CRNN                                   |
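To make the comparison in the results above easier to read, the short sketch below tabulates the PSDS1 difference between two models for each condition. The values are copied from the listing; the dictionary layout and the helper name `psds1_gain` are illustrative choices, not part of any published tooling. On these numbers the curriculum-learning model is at or above the WildDESED-only model in every condition, while the row labelled plain CRNN remains higher at 10 dB and on clean audio.

```python
CONDITIONS = ["-5dB", "0dB", "5dB", "10dB", "Clean"]

# PSDS1 values as listed in the results, keyed by model then condition.
PSDS1 = {
    "CRNN (WildDESED + Curriculum learning)": {
        "-5dB": 0.049, "0dB": 0.114, "5dB": 0.175, "10dB": 0.212, "Clean": 0.265,
    },
    "CRNN (WildDESED)": {
        "-5dB": 0.048, "0dB": 0.087, "5dB": 0.135, "10dB": 0.175, "Clean": 0.2,
    },
    "CRNN": {
        "-5dB": 0.017, "0dB": 0.064, "5dB": 0.148, "10dB": 0.222, "Clean": 0.348,
    },
}


def psds1_gain(model_a, model_b):
    """Per-condition PSDS1 of model_a minus model_b, rounded to 3 decimals."""
    return {
        cond: round(PSDS1[model_a][cond] - PSDS1[model_b][cond], 3)
        for cond in CONDITIONS
    }


if __name__ == "__main__":
    curriculum = "CRNN (WildDESED + Curriculum learning)"
    for other in ("CRNN (WildDESED)", "CRNN"):
        print(f"{curriculum} vs {other}:")
        for cond, delta in psds1_gain(curriculum, other).items():
            print(f"  {cond:>5}: {delta:+.3f}")
```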

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits (2025-07-18)
GeoReg: Weight-Constrained Few-Shot Regression for Socio-Economic Estimation using LLM (2025-07-17)
The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities (2025-07-17)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)