Metric: CLAP_LAION (higher is better)
| # | Model | CLAP_LAION | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Audiobox Sound | 0.71 | No | Audiobox: Unified Audio Generation with Natural ... | 2023-12-25 | - |
| 2 | ETTA-FT-AC-100k | 0.6 | No | ETTA: Elucidating the Design Space of Text-to-Au... | 2024-12-26 | Code |
| 3 | ETTA | 0.54 | No | ETTA: Elucidating the Design Space of Text-to-Au... | 2024-12-26 | Code |
| 4 | AudioLDM2-large | 0.53 | No | AudioLDM 2: Learning Holistic Audio Generation w... | 2023-08-10 | Code |
| 5 | Tango-AF&AC-FT-AC | 0.527 | No | Improving Text-To-Audio Models with Synthetic Ca... | 2024-06-18 | Code |
| 6 | TangoFlux | 0.488 | No | TangoFlux: Super Fast and Faithful Text to Audio... | 2024-12-30 | Code |
| 7 | TangoFlux-base | 0.438 | No | TangoFlux: Super Fast and Faithful Text to Audio... | 2024-12-30 | Code |
| 8 | Stable Audio | 0.41 | No | Fast Timing-Conditioned Latent Audio Diffusion | 2024-02-07 | Code |
| 9 | Stable Audio Open | 0.35 | No | Stable Audio Open | 2024-07-19 | Code |
| 10 | AudioLDM 2-AC-Large | 0.243 | No | AudioLDM 2: Learning Holistic Audio Generation w... | 2023-08-10 | Code |
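
For readers unfamiliar with the metric, the sketch below shows one common way a CLAP-style score is computed. It assumes the score is the cosine similarity between a text-prompt embedding and the embedding of the generated audio, averaged over all prompt/audio pairs. The function names and the toy vectors are hypothetical, and the evaluation protocols behind the table entries may differ in checkpoint, preprocessing, or aggregation. In practice the embeddings would come from a CLAP encoder, which is not loaded here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero-norm embedding")
    return dot / (norm_a * norm_b)


def mean_clap_score(text_embeddings, audio_embeddings):
    """Average text/audio cosine similarity over paired embeddings.

    Assumption: text_embeddings[i] is the prompt used to generate
    the clip behind audio_embeddings[i].
    """
    if len(text_embeddings) != len(audio_embeddings):
        raise ValueError("need one audio embedding per text embedding")
    if not text_embeddings:
        raise ValueError("no embeddings given")
    scores = [
        cosine_similarity(t, a)
        for t, a in zip(text_embeddings, audio_embeddings)
    ]
    return sum(scores) / len(scores)


# Toy 2-D vectors standing in for real encoder outputs (illustrative only).
toy_text = [[1.0, 0.0], [1.0, 0.0]]
toy_audio = [[1.0, 0.0], [0.0, 1.0]]
toy_score = mean_clap_score(toy_text, toy_audio)
```

Under this definition a score closer to 1 means the generated audio sits closer to its prompt in the shared embedding space, which is why higher is better.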