Text to Audio Retrieval on Clotho

Metric: mAP@10 (higher is better)
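
The page does not spell out how mAP@10 is computed, so the following is only a minimal sketch of one common definition: average precision truncated at rank 10, normalized by min(number of relevant items, 10), then averaged over all text queries. The function names and toy data are hypothetical, and the exact normalization used by this leaderboard is an assumption. Scores such as 40.14 appear to be percentages, so the result would be multiplied by 100 under that reading.

```python
from typing import Hashable, Sequence, Set


def average_precision_at_k(ranked: Sequence[Hashable],
                           relevant: Set[Hashable],
                           k: int = 10) -> float:
    """AP@k for one query; `ranked` is assumed to contain no duplicates."""
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    # Assumed normalization: best achievable number of hits within the top k.
    return score / min(len(relevant), k)


def mean_average_precision_at_k(rankings: Sequence[Sequence[Hashable]],
                                relevants: Sequence[Set[Hashable]],
                                k: int = 10) -> float:
    """Mean of AP@k over all queries (0.0 if there are no queries)."""
    if not rankings:
        return 0.0
    total = sum(average_precision_at_k(r, rel, k)
                for r, rel in zip(rankings, relevants))
    return total / len(rankings)


# Toy example: two caption queries, each with a single relevant audio clip.
rankings = [["clip_a", "clip_b", "clip_c"], ["clip_x", "clip_y", "clip_z"]]
relevants = [{"clip_b"}, {"clip_q"}]
score = mean_average_precision_at_k(rankings, relevants, k=10)
```

When each query has exactly one relevant clip, AP@10 reduces to 1/rank if the clip appears in the top 10 and 0 otherwise, so the toy example averages 0.5 and 0.0.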

# | Model | mAP@10 | Extra Data | Paper | Date | Code
1 | PaSST-RoBERTa & Estimated Audio–Caption Correspondences | 40.14 | Yes | Estimated Audio-Caption Correspondences Improve ... | 2024-08-21 | Code
2 | PaSST–RoBERTa & GPT-augment | 38.56 | Yes | Advancing Natural-Language Based Audio Retrieval... | 2023-08-08 | Code