SONICS

Synthetic Or Not - Identifying Counterfeit Songs

AudioCC BY-NC 4.0Introduced 2024-08-26

SONICS is a large-scale dataset comprising 97,164 songs48,090 real songs from YouTube and 49,074 fake songs from Suno & Udio — designed for synthetic song detection (SSD), also known as fake song detection (FSD). It addresses several limitations of existing datasets, such as the lack of end-to-end fake songs, limited diversity in music-lyrics, and insufficient long-duration songs. The average length of the songs in SONICS is 176 seconds, which enables the capture of long-context relationships. Moreover, SONICS provides open access to generated fake songs and is divided into 66,709 songs for training, 26,015 songs for testing, and 4,440 songs for validation. Additionally, the inclusion of song lyrics in SONICS dataset paves the way for future research in this field.