BabyLM is a dataset for small scale language modeling, human language acquisition, low-resource NLP, and cognitive modeling. In partnership with CoNLL and CMCL, it provides a platform for approaches to pretraining with a limited-size corpus sourced from data inspired by the input to children. The task has three tracks, two of which restrict the training data to pre-released datasets of 10M and 100M words and are dedicated to explorations of approaches such as architectural variations, self-supervised objectives, or curriculum learning. The final track only restricts the amount of text used, allowing innovation in the choice of the data, its domain, and even its modality (i.e., data from sources other than text is welcome).
WHOOPS! is a dataset and benchmark for visual commonsense. The dataset is comprised of purposefully commonsense-defying images created by designers using publicly available image-generation tools like Midjourney. The images defy commonsense for a wide range of reasons, including deviations from expected social norms and everyday knowledge.
SRRS (Snow Removal in Realistic Scenario) contains 15,000 synthesized snow images and 1,000 real-scenario snow images downloaded from the Internet.
PGPS9K is a new large-scale plane geometry problem-solving dataset, labeled with both fine-grained diagram annotations and interpretable solution programs.
M3KE is a Massive Multi-Level Multi-Subject Knowledge Evaluation benchmark, developed to measure the knowledge acquired by Chinese large language models by testing their multitask accuracy in zero- and few-shot settings. It collects 20,477 questions from 71 tasks. The selection covers all major levels of the Chinese education system, ranging from primary school to college, as well as a wide variety of subjects, including humanities, history, politics, law, education, psychology, science, technology, art, and religion. All questions are multiple-choice with four options, guaranteeing a standardized and unified assessment process.
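As a rough illustration, per-task and macro-averaged accuracy for four-option multiple-choice questions can be computed as below; the function name, the letter-based answer format, and the macro averaging are assumptions made for this sketch, not the benchmark's official scoring script:

```python
def multitask_accuracy(predictions, answers):
    """Per-task accuracy and the macro average over tasks for
    four-option multiple-choice questions (answers are 'A'-'D')."""
    per_task = {}
    for task, preds in predictions.items():
        gold = answers[task]
        correct = sum(p == g for p, g in zip(preds, gold))
        per_task[task] = correct / len(gold)
    macro = sum(per_task.values()) / len(per_task)
    return per_task, macro

# Hypothetical example with two tasks
per_task, macro = multitask_accuracy(
    {"history": ["A", "C", "B"], "law": ["D", "D"]},
    {"history": ["A", "B", "B"], "law": ["D", "C"]},
)
```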
Medical imaging has become increasingly important in diagnosing and treating oncological patients, particularly in radiotherapy. Recent advances in synthetic computed tomography (sCT) generation have increased interest in public challenges to provide data and evaluation metrics for comparing different approaches openly. This paper describes a dataset of brain and pelvis computed tomography (CT) images with rigidly registered cone-beam CT (CBCT) and magnetic resonance imaging (MRI) images to facilitate the development and evaluation of sCT generation for radiotherapy planning.
OxIOD, the Oxford Inertial Odometry Dataset [<a id="d1" href="#oxiod">1</a>], is a large set of inertial data for inertial odometry, recorded by smartphones at 100 Hz in indoor environments. The suite consists of 158 tests and covers a distance of over 42 km, with optical motion-capture (OMC) ground truth available for 132 tests. The dataset does not include pure rotational or pure translational movements, which would be helpful for systematically evaluating a model's performance under different conditions; however, it covers a wide range of everyday movements.
GlobalOpinionQA consists of questions and answers from cross-national surveys designed to capture diverse opinions on global issues across different countries. It contains a subset of survey questions about global issues and opinions adapted from the World Values Survey and Pew Global Attitudes Survey.
Open-Platypus is the curated dataset used to fine-tune Platypus, a family of fine-tuned and merged Large Language Models (LLMs) that achieved first place on HuggingFace's Open LLM Leaderboard at the time of release.
The dataset used for the NTIRE 2022 Spectral Recovery Challenge.
The dataset presents an open set of high-resolution test clips with different types of content: movie fragments, sports streams, and live-caption clips. The clips have 1920×1080 resolution and durations from 13 to 38 seconds. Reliable gaze data were collected from 50 observers (19–24 years old) using a 500 Hz SMI iViewX Hi-Speed 1250 eye tracker. Cross-fades between clips ensure the independence of the fixations recorded for different clips. The final ground-truth saliency map was estimated as a Gaussian mixture with centers at the fixation points; a standard deviation of 120 was chosen for the Gaussians (this value matches 8 angular degrees, which is known to be the sector of sharp vision).
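The final ground-truth step can be sketched in a few lines of NumPy; the function name and the normalization to [0, 1] are illustrative assumptions, not part of any released tooling:

```python
import numpy as np

def saliency_map(fixations, height=1080, width=1920, sigma=120.0):
    """Build a ground-truth saliency map as a mixture of isotropic
    Gaussians centered at the fixation points (sigma in pixels)."""
    ys, xs = np.mgrid[0:height, 0:width]
    sal = np.zeros((height, width), dtype=np.float64)
    for fx, fy in fixations:
        sal += np.exp(-((xs - fx) ** 2 + (ys - fy) ** 2) / (2.0 * sigma ** 2))
    if sal.max() > 0:
        sal /= sal.max()  # normalize to [0, 1] for comparison across clips
    return sal

# Hypothetical fixations on a single 1080p frame
m = saliency_map([(960, 540), (400, 300), (1500, 800)])
```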
Nutrition5k is a dataset of visual and nutritional data for ~5k realistic plates of food captured from Google cafeterias using a custom scanning rig. We are releasing this dataset alongside our recent CVPR 2021 paper to help promote research in visual nutrition understanding. Please see the paper for more details on the dataset and follow-up experiments.
3D video dataset from the CVPR 2022 paper "Neural 3D Video Synthesis".
VisIT-Bench is a new vision-language instruction following benchmark inspired by real-world use cases. Testing 70 diverse “wish-list” skills with an automated ranking system, it advances the ongoing assessment of multimodal chatbot performance.
This dataset was introduced by [1] but was not used in their experiments. [2] propose to use all clips from the first 7 games as a training set, and the remaining clips as a testing set.
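The split proposed by [2] can be sketched as follows; representing each clip as a (game_id, clip_id) pair and numbering games from 1 are assumptions made for illustration:

```python
def split_by_game(clips, n_train_games=7):
    """Split clips into train/test sets following [2]: all clips from
    the first `n_train_games` games go to training, the rest to testing.
    Each clip is a (game_id, clip_id) pair; games are numbered from 1."""
    train = [c for c in clips if c[0] <= n_train_games]
    test = [c for c in clips if c[0] > n_train_games]
    return train, test

# Hypothetical example: 9 games with 3 clips each
clips = [(g, i) for g in range(1, 10) for i in range(3)]
train, test = split_by_game(clips)
```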
Image quality assessment (IQA) databases enable researchers to evaluate the performance of IQA algorithms and contribute towards attaining the ultimate goal of objective quality assessment research - matching human perception. Most publicly available image quality databases have been created under highly controlled conditions by introducing graded simulated distortions onto high-quality photographs. However, images captured using typical real-world mobile camera devices are usually afflicted by complex mixtures of multiple distortions, which are not necessarily well-modeled by the synthetic distortions found in existing databases. Our newly designed and created LIVE In the Wild Image Quality Challenge Database contains widely diverse authentic image distortions on a large number of images captured using a representative variety of modern mobile devices. We also designed and implemented a new online crowdsourcing system, which we have used to conduct a very large-scale, multi-month image quality assessment study.
The Multi-domain Mobile Video Physiology Dataset (MMPD) comprises 11 hours (1,152K frames) of mobile-phone recordings of 33 subjects. The dataset was designed to capture videos with greater representation across skin tone, body motion, and lighting conditions. MMPD is comprehensive, with eight descriptive labels, and can be used in conjunction with the rPPG-toolbox and PhysBench. MMPD is widely used for rPPG tasks and remote heart-rate estimation. To access the dataset, download the data release agreement and request access by email.
Amazon-Sports is a sub-category of the Amazon dataset, which contains a series of product reviews crawled from Amazon.com.
PolyMNIST is based on the original MNIST dataset and consists of 5 different modalities. Compared to the original dataset, the digits are scaled down by a factor of $0.75$ so that there is more space for the random translation.
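The scaling-and-translation step can be sketched as follows; the nearest-neighbor resize and the function name are illustrative assumptions (the original pipeline may use a different interpolation):

```python
import numpy as np

def scale_and_translate(digit, scale=0.75, canvas=28, rng=None):
    """Downscale a 28x28 digit by `scale` (nearest-neighbor) and paste
    it at a random offset on an empty canvas of size `canvas`."""
    rng = np.random.default_rng() if rng is None else rng
    size = int(round(digit.shape[0] * scale))      # 28 * 0.75 = 21
    idx = (np.arange(size) / scale).astype(int)    # nearest-neighbor indices
    small = digit[np.ix_(idx, idx)]
    out = np.zeros((canvas, canvas), dtype=digit.dtype)
    dy, dx = rng.integers(0, canvas - size + 1, size=2)
    out[dy:dy + size, dx:dx + size] = small
    return out

# Toy example: an all-ones "digit" lands as a 21x21 patch on a 28x28 canvas
img = scale_and_translate(np.ones((28, 28), dtype=np.uint8))
```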
Yu Luo, Jianbo Ye, Reginald B. Adams, Jr., Jia Li, Michelle G. Newman, and James Z. Wang, "ARBEE: Towards Automated Recognition of Bodily Expression of Emotion In the Wild," International Journal of Computer Vision, vol. 128, no. 1, pp. 1-25, 2020.