Datasets

19,997 machine learning datasets


Wiki-ZSL

The Wiki-ZSL (Wiki Zero-Shot Learning) dataset contains 113 relations and 94,383 instances from Wikipedia. The dataset is divided into three subsets: training set (98 relations), validation set (5 relations) and test set (10 relations).

24 papers | 1 benchmark | Texts
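The relation counts above (98 + 5 + 10 = 113) describe a split in which no relation is shared between subsets. The following is a minimal sketch of such a relation-disjoint split. The relation labels are hypothetical placeholders, the function name `split_relations` is introduced here for illustration, and the random shuffle is an assumption: the listing does not say how the dataset's own split was chosen.

```python
import random


def split_relations(relations, n_val, n_test, seed=0):
    """Partition relation labels into disjoint train/val/test sets."""
    # Sort first so the result depends only on the seed, not on input order.
    shuffled = sorted(relations)
    random.Random(seed).shuffle(shuffled)
    test = set(shuffled[:n_test])
    val = set(shuffled[n_test:n_test + n_val])
    train = set(shuffled[n_test + n_val:])
    return train, val, test


# Hypothetical placeholder labels standing in for the 113 relations.
relations = [f"relation_{i}" for i in range(113)]
train, val, test = split_relations(relations, n_val=5, n_test=10)
print(len(train), len(val), len(test))  # 98 5 10
```

Instances would then be assigned to a subset according to their relation label, so that test relations are never seen during training.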

VocalSound

VocalSound is a free dataset consisting of 21,024 crowdsourced recordings of laughter, sighs, coughs, throat clearing, sneezes, and sniffs from 3,365 unique subjects. The VocalSound dataset also contains meta-information such as speaker age, gender, native language, country, and health condition.

24 papers | 2 benchmarks | Audio

twitch-gamers

A graph dataset used for node classification.

24 papers | 2 benchmarks | Graphs

KITTI-STEP

The Segmenting and Tracking Every Pixel (STEP) benchmark consists of 21 training sequences and 29 test sequences. It is based on the KITTI Tracking Evaluation and the Multi-Object Tracking and Segmentation (MOTS) benchmark, and extends their annotations to the STEP task. Source: http://www.cvlibs.net/datasets/kitti/eval_step.php

24 papers | 15 benchmarks | Images

105,941 Images Natural Scenes OCR Data of 12 Languages

This dataset contains 105,941 natural-scene images for OCR in 12 languages (6 Asian and 6 European), covering multiple natural scenes and photographic angles. Text is annotated with line-level quadrilateral bounding boxes and transcriptions. The data can be used for tasks such as multilingual OCR.

24 papers | 0 benchmarks | Images

FreeSolv (Free Solvation)

The FreeSolv database offers a curated collection of experimental and calculated hydration free energies for small molecules in water. It includes both experimental values obtained from prior literature and calculated values based on simulations. The goal is to provide accurate hydration free energy data, which is essential for understanding solvation properties and interactions of molecules in aqueous environments.

24 papers | 1 benchmark

REALY (Region-aware benchmark based on the LYHM)

The REALY benchmark aims to introduce a region-aware evaluation pipeline to measure the fine-grained normalized mean square error (NMSE) of 3D face reconstruction methods from under-controlled image sets.

24 papers | 25 benchmarks | 3D, 3D meshes, Images, RGB-D

QM7

The QM7 dataset is a subset of the GDB-13 database, which contains nearly 1 billion stable and synthetically accessible organic molecules. The QM7 subset includes only molecules with up to 23 atoms. These atoms consist of carbon (C), nitrogen (N), oxygen (O), and sulfur (S). The QM7 dataset contains 7,165 molecules in total. Each molecule is represented by its Coulomb matrix, which captures the interactions between atoms.

24 papers | 2 benchmarks
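To make the Coulomb matrix representation concrete, here is a minimal sketch using the commonly cited definition: diagonal entries are 0.5 · Z_i^2.4 and off-diagonal entries are Z_i · Z_j / |R_i − R_j|, where Z is the nuclear charge and R the atom position. The function name `coulomb_matrix` and the two-atom geometry are illustrative assumptions, not values taken from QM7.

```python
import math


def coulomb_matrix(charges, coords):
    """Build the Coulomb matrix from nuclear charges and 3D positions."""
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: 0.5 * Z_i ** 2.4
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: pairwise Coulomb repulsion Z_i * Z_j / distance
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


# Toy example (hypothetical geometry): a carbon and an oxygen atom 2.0 units apart.
charges = [6, 8]
coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
cm = coulomb_matrix(charges, coords)
print(cm[0][1])  # 24.0
```

The matrix is symmetric, and its size depends on the number of atoms in the molecule.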

MSU Video Upscalers: Quality Enhancement

The dataset aims to find the algorithms that produce the most visually pleasant image possible and generalize well to a broad range of content. It consists of 30 clips and contains 15 2D-animated segments losslessly recorded from various video games and 15 camera-shot segments from high-bitrate YUV444 sources. The complexity of clips varies significantly in terms of spatial and temporal indexes. Multiple bicubic downscaling mixed with sharpening is used to simulate complex real-world camera degradation. The authors used slight compression and YUV420 conversion to simulate a practical use case. 1920×1080 sources were downscaled to 480×270 input.

24 papers | 44 benchmarks

MMSE-HR (Multimodal Spontaneous Expression-Heart Rate dataset)

The MMSE-HR benchmark consists of a dataset of 102 videos from 40 subjects recorded at 1040×1392 raw resolution at 25 fps. During the recordings, various stimuli such as videos, sounds, and smells are introduced to induce different emotional states in the subjects. The ground truth waveform for MMSE-HR is the blood pressure signal sampled at 1000 Hz. The dataset contains a diverse distribution of skin colors in the Fitzpatrick scale (II=8, III=11, IV=17, V+VI=4).

24 papers | 20 benchmarks | Images, Medical

eLife (Scientific Lay Summarization)

This dataset contains 4,828 full biomedical articles paired with non-technical lay summaries derived from the eLife scientific journal.

24 papers | 8 benchmarks

InterHuman

InterHuman is a multimodal dataset consisting of about 107M frames of diverse two-person interactions, with accurate skeletal motions and 16,756 natural language descriptions.

24 papers | 16 benchmarks | Images, Texts

LOL-v2 (LOL-v2-real)

LOL-v2-real contains 689 low-/normal-light image pairs for training and 100 pairs for testing.

24 papers | 3 benchmarks

HarMeme

HarMeme is a benchmark dataset for hateful meme classification containing 3,544 memes related to COVID-19 collected from the Internet.

24 papers | 2 benchmarks | Images

Ubuntu IRC

The Ubuntu IRC dataset is a resource for research in natural language understanding and dialogue systems.

24 papers | 1 benchmark

Amazon Sports (Amazon Sports 5-core)

24 papers | 1 benchmark

TempQuestions

TempQuestions is a benchmark containing 1,271 questions, all temporal in nature, paired with their answers.

24 papers | 2 benchmarks

PASCAL Face

The PASCAL Face dataset is a dataset for face detection and face recognition. It contains 851 images, a subset of PASCAL VOC, with a total of 1,341 annotations. With only a few hundred images, it offers limited variation in face appearance.

23 papers | 6 benchmarks | Images

PhysioNet Challenge 2012

The PhysioNet Challenge 2012 dataset is publicly available and contains the de-identified records of 8,000 patients in Intensive Care Units (ICU). Each record consists of roughly 48 hours of multivariate time series data with up to 37 features, such as respiratory rate and glucose, recorded at various times during the patient's stay.

23 papers | 17 benchmarks | Images, Medical

CIFAR10-DVS

CIFAR10-DVS is an event-stream dataset for object classification. 10,000 frame-based images from the CIFAR-10 dataset are converted into 10,000 event streams with an event-based sensor whose resolution is 128×128 pixels. The dataset has intermediate difficulty, with 10 different classes. The conversion uses a repeated closed-loop smooth (RCLS) movement of the frame-based images, which produces rich local intensity changes in continuous time that are quantized by each pixel of the event-based camera.

23 papers | 2 benchmarks | Images
