Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Audio Generation on VGG-Sound

Metric: FAD (Fréchet Audio Distance; lower is better)

Results

| # | Model | FAD (sorted descending) | Extra Data | Paper | Date | Code |
|---|-------|-------------------------|------------|-------|------|------|
| 1 | VATT-LLama | 2.38 | No | Tell What You Hear From What You See -- Video to... | 2024-11-08 | Code |
| 2 | ReWas | 2.16 | No | Read, Watch and Scream! Sound Generation from Te... | 2024-07-08 | Code |
| 3 | MaskVAT_Hybrid | 2.04 | No | Masked Generative Video-to-Audio Transformers wi... | 2024-07-15 | - |
| 4 | V-AURA | 1.92 | No | Temporally Aligned Audio for Video with Autoregr... | 2024-09-20 | Code |
| 5 | Frieren | 1.32 | No | Frieren: Efficient Video-to-Audio Generation Net... | 2024-06-01 | Code |
| 6 | MMAudio-L-44.1kHz | 0.97 | No | MMAudio: Taming Multimodal Joint Training for Hi... | 2024-12-19 | Code |
| 7 | V2A-Mapper | 0.841 | No | V2A-Mapper: A Lightweight Solution for Vision-to... | 2023-08-18 | Code |
| 8 | MMAudio-S-16kHz | 0.79 | No | MMAudio: Taming Multimodal Joint Training for Hi... | 2024-12-19 | Code |