Mivia Audio Events Dataset

Audio

The MIVIA audio events data set is composed of a total of 6000 events for surveillance applications, namely glass breaking, gun shots and screams. The 6000 events are divided into a training set (composed of 4200 events) and a test set (composed of 1800 events).

In audio surveillance applications, the events of interest (for instance a scream) can occur at different distances from the microphone that correspond to different levels of the signal-to-noise ratio. Moreover, in these applications the events are generally mixed with a complex background, usually composed of several types of different sounds depending on the specific environments both indoor and outdoor (household appliances, cheering of crowds, talking people, traffic jam, passing cars or motorbikes etc.).

The data set is designed to provide each audio event at 6 different values of signal-to-noise ratio (namely 5dB, 10dB, 15dB, 20dB, 25dB and 30dB) and overimposed to different combinations of environmental sounds in order to simulate their occurrence in different ambiences.