Audio Generation on AudioCaps
Metric: FD (Fréchet Distance; lower is better). Rows are listed in descending FD order.
Results
| # | Model | FD | Extra Data | Paper | Date | Code |
|---|-------|----|------------|-------|------|------|
| 1 | Diffsound | 47.68 | No | Diffsound: Discrete Diffusion Model for Text-to-... | 2022-07-20 | Code |
| 2 | AudioLDM2-large | 26.18 | No | AudioLDM 2: Learning Holistic Audio Generation w... | 2023-08-10 | Code |
| 3 | TANGO | 24.52 | No | Text-to-Audio Generation using Instruction-Tuned... | 2023-04-24 | Code |
| 4 | AudioLDM-L-Full | 23.31 | No | AudioLDM: Text-to-Audio Generation with Latent D... | 2023-01-29 | Code |
| 5 | Auffusion-Full | 23.08 | No | Auffusion: Leveraging the Power of Diffusion and... | 2024-01-02 | Code |
| 6 | CoDi | 22.9 | No | Any-to-Any Generation via Composable Diffusion | 2023-05-19 | Code |
| 7 | Auffusion | 21.99 | No | Auffusion: Leveraging the Power of Diffusion and... | 2024-01-02 | Code |
| 8 | Consistency TTA (Single-step generation) | 20.44 | No | ConsistencyTTA: Accelerating Diffusion-Based Tex... | 2023-09-19 | Code |
| 9 | Make-An-Audio | 18.32 | No | Make-An-Audio: Text-To-Audio Generation with Pro... | 2023-01-30 | Code |
| 10 | Tango-AF&AC-FT-AC | 17.19 | No | Improving Text-To-Audio Models with Synthetic Ca... | 2024-06-18 | Code |
| 11 | GenAu-Large | 16.51 | No | Taming Data and Transformers for Audio Generation | 2024-06-27 | Code |
| 12 | ETTA | 13.12 | No | ETTA: Elucidating the Design Space of Text-to-Au... | 2024-12-26 | Code |
| 13 | Make-An-Audio 2 | 11.75 | No | Make-An-Audio 2: Temporal-Enhanced Text-to-Audio... | 2023-05-29 | Code |
| 14 | ETTA-FT-AC-100k | 10.1 | No | ETTA: Elucidating the Design Space of Text-to-Au... | 2024-12-26 | Code |
| 15 | Audiobox Sound | 8.3 | No | Audiobox: Unified Audio Generation with Natural ... | 2023-12-25 | - |