Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

AudioCaps

Modalities: Audio, Texts
License: Unknown
Introduced: 2019-06-01

AudioCaps is a dataset of sounds with event descriptions that was introduced for the task of audio captioning, with sounds sourced from the AudioSet dataset. Annotators were provided the audio tracks together with category hints (and with additional video hints if needed).

Source: Audio Retrieval with Natural Language Queries

Image source: https://audiocaps.github.io/

Benchmarks

Audio Generation: FD_openl3, FAD, FD, KL_passt, IS, CLAP_LAION, CLAP_MS
Audio Source Separation: SDRi, SI-SDRi
Audio captioning: SPIDEr, CIDEr, SPICE, BLEU-4, METEOR, ROUGE-L, FENSE, SPIDEr-FL, #params (M), ROUGE, Sentence-BERT
Target Sound Extraction: SDRi, SI-SDRi
Text to Audio Retrieval: R@1, R@5, R@10

Statistics

Papers: 279
Benchmarks: 25

Links

Homepage

Tasks

Audio Generation
Audio Source Separation
Audio captioning
Audio to Text Retrieval
Audio/Video to Text Retrieval
Retrieval-augmented Few-shot In-context Audio Captioning
Target Sound Extraction
Text to Audio Retrieval
Text to Audio/Video Retrieval
Zero-Shot Audio Retrieval
Zero-shot Audio Captioning
Zero-shot Text to Audio Retrieval