19,997 machine learning datasets
The Wiki-ZSL (Wiki Zero-Shot Learning) dataset contains 113 relations and 94,383 instances from Wikipedia. The dataset is divided into three subsets: training set (98 relations), validation set (5 relations) and test set (10 relations).
VocalSound is a free dataset consisting of 21,024 crowdsourced recordings of laughter, sighs, coughs, throat clearing, sneezes, and sniffs from 3,365 unique subjects. The VocalSound dataset also contains meta-information such as speaker age, gender, native language, country, and health condition.
node classification on twitch-gamers
The Segmenting and Tracking Every Pixel (STEP) benchmark consists of 21 training sequences and 29 test sequences. It is based on the KITTI Tracking Evaluation and the Multi-Object Tracking and Segmentation (MOTS) benchmark. This benchmark extends the annotations to the Segmenting and Tracking Every Pixel (STEP) task. [Copy-pasted from http://www.cvlibs.net/datasets/kitti/eval_step.php]
105,941 natural-scene OCR images in 12 languages. The data covers 12 languages (6 Asian and 6 European), multiple natural scenes, and multiple photographic angles. Text is annotated with line-level quadrilateral bounding boxes and transcriptions. The data can be used for tasks such as multilingual OCR.
The FreeSolv database offers a curated collection of experimental and calculated hydration free energies for small molecules in water. It includes both experimental values obtained from prior literature and calculated values based on simulations. The goal is to provide accurate hydration free energy data, which is essential for understanding solvation properties and interactions of molecules in aqueous environments.
The REALY benchmark aims to introduce a region-aware evaluation pipeline to measure the fine-grained normalized mean square error (NMSE) of 3D face reconstruction methods from under-controlled image sets.
The QM7 dataset is a subset of the GDB-13 database. GDB-13 contains nearly 1 billion stable and synthetically accessible organic molecules. In the QM7 subset, only molecules with up to 23 atoms are included. These atoms consist of carbon (C), nitrogen (N), oxygen (O), and sulfur (S). The total number of molecules in the QM7 dataset is 7,165. Each molecule is represented using the Coulomb matrix, which captures the interactions between atoms.
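As a rough illustration of the Coulomb matrix representation mentioned above, the sketch below uses the commonly cited definition: diagonal entries 0.5 * Z_i^2.4 and off-diagonal entries Z_i * Z_j / |R_i - R_j|, with positions in atomic units. The function name `coulomb_matrix` and the two-atom toy input are assumptions for illustration and are not part of the QM7 distribution.

```python
import math


def coulomb_matrix(charges, coords):
    """Build a Coulomb matrix as a list of lists.

    charges: nuclear charges Z_i, one per atom.
    coords:  (x, y, z) positions, assumed to be in atomic units (bohr).
    """
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: 0.5 * Z_i ** 2.4
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: Coulomb repulsion between nuclei i and j
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


# Hypothetical toy input: a carbon (Z=6) and an oxygen (Z=8) nucleus 2.0 bohr apart.
toy_charges = [6, 8]
toy_coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
cm = coulomb_matrix(toy_charges, toy_coords)
print(cm[0][1])  # off-diagonal entry: 6 * 8 / 2.0 = 24.0
```

The matrix is symmetric by construction, and the off-diagonal entries grow as atoms get closer, which is how the representation encodes geometry.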
The dataset aims to find the algorithms that produce the most visually pleasant image possible and generalize well to a broad range of content. It consists of 30 clips and contains 15 2D-animated segments losslessly recorded from various video games and 15 camera-shot segments from high-bitrate YUV444 sources. The complexity of clips varies significantly in terms of spatial and temporal indexes. Multiple bicubic downscaling mixed with sharpening is used to simulate complex real-world camera degradation. The authors used slight compression and YUV420 conversion to simulate a practical use case. 1920×1080 sources were downscaled to 480×270 input.
The MMSE-HR benchmark consists of a dataset of 102 videos from 40 subjects recorded at 1040x1392 raw resolution at 25fps. During the recordings, various stimuli such as videos, sounds, and smells are introduced to induce different emotional states in the subjects. The ground truth waveform for MMSE-HR is the blood pressure signal sampled at 1000Hz. The dataset contains a diverse distribution of skin colors in the Fitzpatrick scale (II=8, III=11, IV=17, V+VI=4).
This dataset contains 4,828 full biomedical articles paired with non-technical lay summaries derived from the eLife scientific journal.
InterHuman is a multimodal dataset consisting of about 107M frames of diverse two-person interactions, with accurate skeletal motions and 16,756 natural language descriptions.
LOL-v2-real contains 689 low-/normal-light image pairs for training and 100 pairs for testing.
HarMeme is a benchmark dataset for hateful meme classification containing 3,544 memes related to COVID-19 collected from the Internet.
The Ubuntu IRC dataset is a valuable resource for research in natural language understanding and dialogue systems.
Amazon Sports (Amazon Sports 5-core)
TempQuestions is a benchmark containing 1,271 questions, all temporal in nature, paired with their answers.
The PASCAL FACE dataset is a dataset for face detection and face recognition. It has a total of 851 images, which are a subset of PASCAL VOC, and a total of 1,341 annotations. Such datasets contain only a few hundred images and have limited variation in face appearance.
The PhysioNet Challenge 2012 dataset is publicly available and contains the de-identified records of 8,000 patients in Intensive Care Units (ICU). Each record consists of roughly 48 hours of multivariate time series data with up to 37 features, such as respiratory rate and glucose, recorded at various times during the patient's stay.
CIFAR10-DVS is an event-stream dataset for object classification. 10,000 frame-based images from the CIFAR-10 dataset are converted into 10,000 event streams with an event-based sensor whose resolution is 128×128 pixels. The dataset has an intermediate difficulty with 10 different classes. The conversion uses a repeated closed-loop smooth (RCLS) movement of the frame-based images. This movement produces rich local intensity changes in continuous time, which are quantized by each pixel of the event-based camera.