BabyLM is a dataset for small scale language modeling, human language acquisition, low-resource NLP, and cognitive modeling. In partnership with CoNLL and CMCL, it provides a platform for approaches to pretraining with a limited-size corpus sourced from data inspired by the input to children. The task has three tracks, two of which restrict the training data to pre-released datasets of 10M and 100M words and are dedicated to explorations of approaches such as architectural variations, self-supervised objectives, or curriculum learning. The final track only restricts the amount of text used, allowing innovation in the choice of the data, its domain, and even its modality (i.e., data from sources other than text is welcome).
WHOOPS! is a dataset and benchmark for visual commonsense. The dataset is comprised of purposefully commonsense-defying images created by designers using publicly available image-generation tools like Midjourney. The images defy commonsense for a wide range of reasons, including deviations from expected social norms and everyday knowledge.
SRRS (Snow Removal in Realistic Scenario) contains 15,000 synthesized snow images and 1,000 real-scenario snow images downloaded from the Internet.
PGPS9K is a new large-scale plane geometry problem-solving dataset, labeled with both fine-grained diagram annotations and interpretable solution programs.
M3KE is a Massive Multi-Level Multi-Subject Knowledge Evaluation benchmark, developed to measure the knowledge acquired by Chinese large language models by testing their multitask accuracy in zero- and few-shot settings. It collects 20,477 questions from 71 tasks. The selection covers all major levels of the Chinese education system, ranging from primary school to college, as well as a wide variety of subjects, including humanities, history, politics, law, education, psychology, science, technology, art, and religion. All questions are multiple-choice with four options, guaranteeing a standardized and unified assessment process.
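As a rough illustration, per-task and macro-averaged accuracy for four-option multiple-choice questions can be computed as below; the function name, the letter-based answer format, and the macro averaging are assumptions made for this sketch, not the benchmark's official scoring script:

```python
def multitask_accuracy(predictions, answers):
    """Per-task accuracy and the macro average over tasks for
    four-option multiple-choice questions (answers are 'A'-'D')."""
    per_task = {}
    for task, preds in predictions.items():
        gold = answers[task]
        correct = sum(p == g for p, g in zip(preds, gold))
        per_task[task] = correct / len(gold)
    macro = sum(per_task.values()) / len(per_task)
    return per_task, macro

# Hypothetical example with two tasks
per_task, macro = multitask_accuracy(
    {"history": ["A", "C", "B"], "law": ["D", "D"]},
    {"history": ["A", "B", "B"], "law": ["D", "C"]},
)
```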
Medical imaging has become increasingly important in diagnosing and treating oncological patients, particularly in radiotherapy. Recent advances in synthetic computed tomography (sCT) generation have increased interest in public challenges to provide data and evaluation metrics for comparing different approaches openly. This paper describes a dataset of brain and pelvis computed tomography (CT) images with rigidly registered cone-beam CT (CBCT) and magnetic resonance imaging (MRI) images to facilitate the development and evaluation of sCT generation for radiotherapy planning.
OxIOD, the Oxford Inertial Odometry Dataset [<a id="d1" href="#oxiod">1</a>], is a large set of inertial data for inertial odometry, recorded by smartphones at 100 Hz in indoor environments. The suite consists of 158 tests and covers a distance of over 42 km, with optical motion-capture (OMC) ground truth available for 132 tests. The dataset does not include pure rotational or pure translational movements, which would be helpful for systematically evaluating a model's performance under different conditions; however, it covers a wide range of everyday movements.
GlobalOpinionQA consists of questions and answers from cross-national surveys designed to capture diverse opinions on global issues across different countries. It contains a subset of survey questions about global issues and opinions adapted from the World Values Survey and Pew Global Attitudes Survey.
Open-Platypus is the curated dataset used to fine-tune Platypus, a family of fine-tuned and merged Large Language Models (LLMs) that achieved first place on HuggingFace's Open LLM Leaderboard at the time of release.
The dataset used for the NTIRE 2022 Spectral Recovery Challenge.
The dataset presents an open set of high-resolution test clips with different types of content: movie fragments, sports streams, and live-caption clips. The clips have 1920×1080 resolution and durations from 13 to 38 seconds. Reliable gaze data were collected from 50 observers (19–24 years old) using a 500 Hz SMI iViewX Hi-Speed 1250 eye tracker. Cross-fades between clips ensure the independence of the fixations recorded for different clips. The final ground-truth saliency map was estimated as a Gaussian mixture with centers at the fixation points; a standard deviation of 120 was chosen for the Gaussians (this value matches 8 angular degrees, which is known to be the sector of sharp vision).
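The final ground-truth step can be sketched in a few lines of NumPy; the function name and the normalization to [0, 1] are illustrative assumptions, not part of any released tooling:

```python
import numpy as np

def saliency_map(fixations, height=1080, width=1920, sigma=120.0):
    """Build a ground-truth saliency map as a mixture of isotropic
    Gaussians centered at the fixation points (sigma in pixels)."""
    ys, xs = np.mgrid[0:height, 0:width]
    sal = np.zeros((height, width), dtype=np.float64)
    for fx, fy in fixations:
        sal += np.exp(-((xs - fx) ** 2 + (ys - fy) ** 2) / (2.0 * sigma ** 2))
    if sal.max() > 0:
        sal /= sal.max()  # normalize to [0, 1] for comparison across clips
    return sal

# Hypothetical fixations on a single 1080p frame
m = saliency_map([(960, 540), (400, 300), (1500, 800)])
```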
Nutrition5k is a dataset of visual and nutritional data for ~5k realistic plates of food captured from Google cafeterias using a custom scanning rig. We are releasing this dataset alongside our recent CVPR 2021 paper to help promote research in visual nutrition understanding. Please see the paper for more details on the dataset and follow-up experiments.
3D video dataset from the CVPR 2022 paper "Neural 3D Video Synthesis".
VisIT-Bench is a new vision-language instruction following benchmark inspired by real-world use cases. Testing 70 diverse “wish-list” skills with an automated ranking system, it advances the ongoing assessment of multimodal chatbot performance.
This dataset was introduced by [1] but was not used in their experiments. [2] propose to use all clips from the first 7 games as a training set, and the remaining clips as a testing set.
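The split proposed by [2] can be sketched as follows; representing each clip as a (game_id, clip_id) pair and numbering games from 1 are assumptions made for illustration:

```python
def split_by_game(clips, n_train_games=7):
    """Split clips into train/test sets following [2]: all clips from
    the first `n_train_games` games go to training, the rest to testing.
    Each clip is a (game_id, clip_id) pair; games are numbered from 1."""
    train = [c for c in clips if c[0] <= n_train_games]
    test = [c for c in clips if c[0] > n_train_games]
    return train, test

# Hypothetical example: 9 games with 3 clips each
clips = [(g, i) for g in range(1, 10) for i in range(3)]
train, test = split_by_game(clips)
```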
Image quality assessment (IQA) databases enable researchers to evaluate the performance of IQA algorithms and contribute towards attaining the ultimate goal of objective quality assessment research - matching human perception. Most publicly available image quality databases have been created under highly controlled conditions by introducing graded simulated distortions onto high-quality photographs. However, images captured using typical real-world mobile camera devices are usually afflicted by complex mixtures of multiple distortions, which are not necessarily well-modeled by the synthetic distortions found in existing databases. Our newly designed and created LIVE In the Wild Image Quality Challenge Database contains widely diverse authentic image distortions on a large number of images captured using a representative variety of modern mobile devices. We also designed and implemented a new online crowdsourcing system, which we have used to conduct a very large-scale, multi-month image quality assessment study.
The Multi-domain Mobile Video Physiology Dataset (MMPD) comprises 11 hours (1,152K frames) of mobile-phone recordings of 33 subjects. The dataset was designed to capture videos with greater representation across skin tone, body motion, and lighting conditions. MMPD is comprehensive, with eight descriptive labels, and can be used in conjunction with the rPPG-toolbox and PhysBench. MMPD is widely used for rPPG tasks and remote heart-rate estimation. To access the dataset, download the data release agreement and request access by email.
Amazon-Sports is a sub-category of the Amazon dataset, which contains a series of product reviews crawled from Amazon.com.
PolyMNIST is based on the original MNIST dataset and consists of 5 different modalities. Compared to the original dataset, the digits are scaled down by a factor of $0.75$ so that there is more space for the random translation.
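The scaling-and-translation step can be sketched as follows; the nearest-neighbor resize and the function name are illustrative assumptions (the original pipeline may use a different interpolation):

```python
import numpy as np

def scale_and_translate(digit, scale=0.75, canvas=28, rng=None):
    """Downscale a 28x28 digit by `scale` (nearest-neighbor) and paste
    it at a random offset on an empty canvas of size `canvas`."""
    rng = np.random.default_rng() if rng is None else rng
    size = int(round(digit.shape[0] * scale))      # 28 * 0.75 = 21
    idx = (np.arange(size) / scale).astype(int)    # nearest-neighbor indices
    small = digit[np.ix_(idx, idx)]
    out = np.zeros((canvas, canvas), dtype=digit.dtype)
    dy, dx = rng.integers(0, canvas - size + 1, size=2)
    out[dy:dy + size, dx:dx + size] = small
    return out

# Toy example: an all-ones "digit" lands as a 21x21 patch on a 28x28 canvas
img = scale_and_translate(np.ones((28, 28), dtype=np.uint8))
```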
Yu Luo, Jianbo Ye, Reginald B. Adams, Jr., Jia Li, Michelle G. Newman, and James Z. Wang, "ARBEE: Towards Automated Recognition of Bodily Expression of Emotion In the Wild," International Journal of Computer Vision, vol. 128, no. 1, pp. 1-25, 2020.