Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Audio Generation on AudioCaps

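Metric: FD (Fréchet Distance, lower is better). Rows in the results table are listed in descending FD order, so the lowest (best) score appears last.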

Results

| # | Model | FD | Extra Data | Paper | Date | Code |
|---|-------|----|------------|-------|------|------|
| 1 | Diffsound | 47.68 | No | Diffsound: Discrete Diffusion Model for Text-to-... | 2022-07-20 | Code |
| 2 | AudioLDM2-large | 26.18 | No | AudioLDM 2: Learning Holistic Audio Generation w... | 2023-08-10 | Code |
| 3 | TANGO | 24.52 | No | Text-to-Audio Generation using Instruction-Tuned... | 2023-04-24 | Code |
| 4 | AudioLDM-L-Full | 23.31 | No | AudioLDM: Text-to-Audio Generation with Latent D... | 2023-01-29 | Code |
| 5 | Auffusion-Full | 23.08 | No | Auffusion: Leveraging the Power of Diffusion and... | 2024-01-02 | Code |
| 6 | CoDi | 22.9 | No | Any-to-Any Generation via Composable Diffusion | 2023-05-19 | Code |
| 7 | Auffusion | 21.99 | No | Auffusion: Leveraging the Power of Diffusion and... | 2024-01-02 | Code |
| 8 | Consistency TTA (Single-step generation) | 20.44 | No | ConsistencyTTA: Accelerating Diffusion-Based Tex... | 2023-09-19 | Code |
| 9 | Make-An-Audio | 18.32 | No | Make-An-Audio: Text-To-Audio Generation with Pro... | 2023-01-30 | Code |
| 10 | Tango-AF&AC-FT-AC | 17.19 | No | Improving Text-To-Audio Models with Synthetic Ca... | 2024-06-18 | Code |
| 11 | GenAu-Large | 16.51 | No | Taming Data and Transformers for Audio Generation | 2024-06-27 | Code |
| 12 | ETTA | 13.12 | No | ETTA: Elucidating the Design Space of Text-to-Au... | 2024-12-26 | Code |
| 13 | Make-An-Audio 2 | 11.75 | No | Make-An-Audio 2: Temporal-Enhanced Text-to-Audio... | 2023-05-29 | Code |
| 14 | ETTA-FT-AC-100k | 10.1 | No | ETTA: Elucidating the Design Space of Text-to-Au... | 2024-12-26 | Code |
| 15 | Audiobox Sound | 8.3 | No | Audiobox: Unified Audio Generation with Natural ... | 2023-12-25 | - |
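
Which row counts as "best" depends on the metric direction, so it can help to re-rank the scores explicitly rather than rely on the listed order. The following is a minimal sketch, not part of the leaderboard itself: the `Entry` type and `rank` helper are hypothetical names introduced for illustration, and only a few (model, FD) pairs from the results above are copied in. The direction is passed as a parameter rather than assumed.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One leaderboard row, reduced to model name and FD score (hypothetical type)."""
    model: str
    fd: float


# A small subset of (model, FD) pairs copied from the results above.
ENTRIES = [
    Entry("Diffsound", 47.68),
    Entry("TANGO", 24.52),
    Entry("ETTA", 13.12),
    Entry("Audiobox Sound", 8.3),
]


def rank(entries, lower_is_better):
    """Return entries ordered best-first under the chosen metric direction."""
    # reverse=True sorts descending, which is best-first when higher is better.
    return sorted(entries, key=lambda e: e.fd, reverse=not lower_is_better)


if __name__ == "__main__":
    # Print the subset best-first, treating FD as a distance (smaller is better).
    for position, entry in enumerate(rank(ENTRIES, lower_is_better=True), start=1):
        print(f"{position}. {entry.model}: {entry.fd}")
```

Calling `rank(ENTRIES, lower_is_better=False)` instead reproduces the descending order in which the rows are listed.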