48 machine learning datasets
We construct a large-scale conducting motion dataset, named ConductorMotion100, by applying pose estimation to conductor-view videos of concert performance recordings collected from online video platforms. Building ConductorMotion100 this way removes the need for expensive motion-capture equipment and makes full use of massive online video resources. As a result, ConductorMotion100 reaches an unprecedented length of 100 hours.
InaGVAD is a Voice Activity Detection (VAD) and Speaker Gender Segmentation (SGS) dataset designed to represent the acoustic diversity of French TV and radio programs. A detailed description of InaGVAD, together with a benchmark of 6 freely available VAD systems and 3 SGS systems, is provided in a paper presented at LREC-COLING 2024.
48 multitrack jazz recordings with many annotations.
XMIDI is a comprehensive, large-scale symbolic music dataset with accurate emotion and genre labels, comprising 108,023 MIDI files. The average piece duration is approximately 176 seconds, for a total dataset length of around 5,278 hours.
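The quoted total can be sanity-checked from the two figures above (a minimal sketch; the 176-second average is itself rounded, so the result differs slightly from the quoted ~5,278 hours):

```python
# Figures quoted in the XMIDI description above.
num_files = 108_023       # MIDI files in the dataset
avg_duration_s = 176      # approximate average piece length, in seconds

# files * seconds-per-file -> total seconds -> hours
total_hours = num_files * avg_duration_s / 3600
print(f"{total_hours:,.0f} hours")  # ~5,281 hours, consistent with the quoted ~5,278
```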
This dataset is a patched version of The Taste & Affect Music Database by D. Guedes et al. It is a set of captions that describe 100 musical pieces and associate gustatory keywords with them, based on Guedes et al.'s findings.
This publicly available dataset consists of synthesised audio for woodwind quartets, including renderings of each instrument in isolation. The data was created as training data for Cadenza's second open machine learning challenge (CAD2), for the task of rebalancing classical music ensembles; it is also intended for developing other music information retrieval (MIR) algorithms using machine learning. It was created to address the lack of large-scale datasets of classical woodwind music with separate audio for each instrument and a permissive license for reuse. Music scores were selected from the OpenScore String Quartet corpus and rendered for two woodwind ensembles: (i) flute, oboe, clarinet and bassoon; and (ii) flute, oboe, alto saxophone and bassoon. This was done by a professional music producer using industry-standard software, with virtual instruments generating the audio for each instrument from software that interpreted the expression markings in the score.