The Hanabi Learning Environment is a new challenge domain with novel problems that arise from its combination of purely cooperative gameplay with two to five players and imperfect information.
HolStep is a dataset based on higher-order logic (HOL) proofs, for the purpose of developing new machine learning-based theorem-proving strategies.
Kuzushiji-49 is an MNIST-like dataset that has 49 classes (28x28 grayscale, 270,912 images) from 48 Hiragana characters and one Hiragana iteration mark.
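The images ship as NumPy arrays; here is a minimal loading sketch, assuming the official .npz release (file and key names such as k49-train-imgs.npz and arr_0 follow the public repository and may differ in other packagings):

```python
import numpy as np

# Assumed file/key names from the official Kuzushiji-49 .npz release.
train_imgs = np.load("k49-train-imgs.npz")["arr_0"]      # (N, 28, 28) uint8 grayscale
train_labels = np.load("k49-train-labels.npz")["arr_0"]  # integer class ids in [0, 48]

assert train_imgs.shape[1:] == (28, 28)
print(train_imgs.shape[0], "images,", train_labels.max() + 1, "classes")
```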
LAD (Large-scale Attribute Dataset) has 78,017 images across 5 super-classes and 230 classes. LAD contains more images than the four most popular attribute datasets (AwA, CUB, aP/aY and SUN) combined. 359 attributes covering visual, semantic, and subjective properties are defined and annotated at the instance level.
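To illustrate what instance-level attribute annotation means in practice, here is a hypothetical record sketched in Python; the field names, path, and attribute values are invented for illustration and do not reflect LAD's actual file format:

```python
# Hypothetical LAD-style record; names and values are invented for illustration.
sample = {
    "image": "images/animals/tiger_0001.jpg",   # hypothetical path
    "super_class": "animals",                   # one of the 5 super-classes
    "class": "tiger",                           # one of the 230 classes
    # Instance-level: each of the 359 attributes is annotated per image,
    # not inherited from the class (three shown here).
    "attributes": {"is_striped": 1, "lives_in_forest": 1, "is_man_made": 0},
}
```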
LC-QuAD is a large question answering dataset with 30,000 pairs of questions and their corresponding SPARQL queries. The target knowledge bases are Wikidata and DBpedia, specifically the 2018 version.
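A hypothetical record sketched in Python shows the question-query pairing; the question, the SPARQL, and the field names are illustrative only, not taken from the released files:

```python
# Hypothetical LC-QuAD-style pair; question, query, and field names are
# invented for illustration.
sample = {
    "question": "Who is the author of Le Petit Prince?",
    "sparql": (
        "SELECT ?author WHERE { "
        '?book rdfs:label "Le Petit Prince"@en . '
        "?book dbo:author ?author . }"
    ),
}
print(sample["question"])
print(sample["sparql"])
```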
LectureBank is a manually collected dataset of lecture slides. It contains 1,352 online lecture files from 60 courses covering 5 domains: Natural Language Processing (NLP), Machine Learning (ML), Artificial Intelligence (AI), Deep Learning (DL), and Information Retrieval (IR). It also contains the corresponding annotations for each slide.
Multi-XScience is a large-scale dataset for multi-document summarization of scientific articles. It has 30,369, 5,066, and 5,093 samples in the train, validation, and test splits, respectively. The average document length is 778.08 words and the average summary length is 116.44 words.
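A minimal loading sketch using the Hugging Face datasets library; the hub id multi_x_science_sum is an assumption and should be verified against the hub:

```python
from datasets import load_dataset

# Assumed hub id; verify before use.
ds = load_dataset("multi_x_science_sum")
print({split: len(ds[split]) for split in ds})  # expect ~30,369 / 5,066 / 5,093
```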
TAPOS is a dataset of sports videos with manual annotations of sub-actions, developed to support the study of temporal action parsing. A sports activity usually consists of multiple sub-actions, and awareness of such temporal structure is beneficial to action recognition.
openDD is a trajectory dataset annotated from drone imagery captured over 501 separate flights, totalling over 62 hours of trajectory data. To date, openDD is by far the largest publicly available trajectory dataset recorded from a drone perspective; comparable datasets span at most 17 hours.
Polyglot-NER is a multilingual named entity recognition dataset built by automatically constructing massive multilingual annotators with minimal human expertise and intervention.
QMUL-SurvFace is a surveillance face recognition benchmark that contains 463,507 face images of 15,573 distinct identities captured in real-world uncooperative surveillance scenes over wide space and time.
SberQuAD is a large-scale analogue of Stanford SQuAD in the Russian language, a valuable resource that had not previously been properly presented to the scientific community.
SlowFlow is an optical flow dataset collected by applying the Slow Flow technique to data from a high-speed camera; it is used to analyze the performance of state-of-the-art optical flow methods under various levels of motion blur.
SpeakingFaces is a publicly available large-scale dataset developed to support multimodal machine learning research in contexts that combine thermal, visual, and audio data streams; examples include human-computer interaction (HCI), biometric authentication, recognition systems, domain transfer, and speech recognition. SpeakingFaces comprises well-aligned, high-resolution thermal and visual spectrum image streams of fully framed faces, synchronized with audio recordings of each subject speaking approximately 100 imperative phrases.
Unite The People is a dataset for 3D body estimation. The images come from the Leeds Sports Pose dataset and its extended version, as well as the single-person subset of the MPII Human Pose Dataset. The images are labeled with several types of annotations, such as segmentation labels, pose, and 3D.
VideoMem is composed of 10,000 videos annotated with memorability scores. In contrast to previous work on image memorability, where memorability was measured a few minutes after memorization, memory performance is measured twice: a few minutes after memorization and again 24-72 hours later.
WikiAtomicEdits is a corpus of 43 million atomic edits across 8 languages. These edits are mined from Wikipedia edit history and consist of instances in which a human editor has inserted a single contiguous phrase into, or deleted a single contiguous phrase from, an existing sentence.
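To make the edit structure concrete, here are two hypothetical records sketched in Python, one insertion and one deletion; the field names and sentences are invented for illustration:

```python
# Hypothetical WikiAtomicEdits-style records; field names and sentences
# are invented for illustration.
insertion = {
    "type": "insertion",
    "base_sentence": "The bridge was completed in 1932.",
    "phrase": "after four years of construction",
    "edited_sentence": "The bridge was completed in 1932 "
                       "after four years of construction.",
}
deletion = {
    "type": "deletion",
    "base_sentence": "The old stone bridge spans the river.",
    "phrase": "old stone",
    "edited_sentence": "The bridge spans the river.",
}
```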
WikiConv is a corpus that encompasses the complete history of conversations between contributors to Wikipedia, one of the largest online collaborative communities. By recording the intermediate states of conversations, including not only comments and replies but also their modifications, deletions, and restorations, this data offers an unprecedented view of online conversation.
LAReQA is a challenging new benchmark for language-agnostic answer retrieval from a multilingual candidate pool.
xR-EgoPose is a synthetic dataset for egocentric 3D human pose estimation. It consists of ~380 thousand photo-realistic egocentric camera images in a variety of indoor and outdoor spaces.