19,997 machine learning datasets
19,997 dataset results
Grammatical error correction dataset for text from Yahoo! Answers
VGaokao is a verification style reading comprehension dataset designed for native speakers' evaluation.
We created a building-image paired dataset that contains more than 3K samples using our roof modeling tools.
Description The Berkeley Multimodal Human Action Database (MHAD) contains 11 actions performed by 7 male and 5 female subjects in the range 23-30 years of age except for one elderly subject. All the subjects performed 5 repetitions of each action, yielding about 660 action sequences which correspond to about 82 minutes of total recording time. In addition, we have recorded a T-pose for each subject which can be used for the skeleton extraction; and the background data (with and without the chair used in some of the activities). The specified set of actions comprises of the following: (1) actions with movement in both upper and lower extremities, e.g., jumping in place, jumping jacks, throwing, etc., (2) actions with high dynamics in upper extremities, e.g., waving hands, clapping hands, etc. and (3) actions with high dynamics in lower extremities, e.g., sit down, stand up. Prior to each recording, the subjects were given instructions on what action to perform; however no specific deta
EmoCause is a dataset of annotated emotion cause words in emotional situations from the EmpatheticDialogues valid and test set. The goal is to recognize emotion cause words in sentences by training only on sentence-level emotion labels without word-level labels (i.e., weakly-supervised emotion cause recognition).
ARCA23K is a dataset of labelled sound events created to investigate real-world label noise. It contains 23,727 audio clips originating from Freesound, and each clip belongs to one of 70 classes taken from the AudioSet ontology. The dataset was created using an entirely automated process with no manual verification of the data. For this reason, many clips are expected to be labelled incorrectly.
Saint Gall dataset contains handwritten historical manuscripts written in Latin that date back to the 9th century. It consists of 60 pages, 1 410 text lines and 11 597 words.
EFO-1-QA is a new dataset to benchmark the combinatorial generalizability of Complex Query Answering (CQA) models by including 301 different queries types, which is 20 times larger than existing datasets.
BiRdQA is a bilingual multiple-choice question answering dataset with 6614 English riddles and 8751 Chinese riddles.
The GermEval dataset is a valuable resource for natural language processing (NLP) tasks, specifically named entity recognition (NER), conducted in the German language. Here are some key details about this dataset:
ERATO is a large-scale multi-modal dataset for Pairwise Emotional Relationship Recognition (PERR). It has 31,182 video clips, lasting about 203 video hours. Different from the existing datasets, ERATO contains interaction-centric videos with multi-shots, varied video length, and multiple modalities including visual, audio and text
AStitchInLanguageModels is a dataset for the exploration of idiomaticity in pre-trained language models.
MAVS is an audio-visual smartphone dataset captured in five different recent smartphones. This new dataset contains 103 subjects captured in three different sessions considering the different real-world scenarios. Three different languages are acquired in this dataset to include the problem of language dependency of the speaker recognition systems.
MuViHand is a dataset for 3D Hand Pose Estimation that consists of multi-view videos of the hand along with ground-truth 3D pose labels. The dataset includes more than 402,000 synthetic hand images available in 4,560 videos. The videos have been simultaneously captured from six different angles with complex backgrounds and random levels of dynamic lighting. The data has been captured from 10 distinct animated subjects using 12 cameras in a semi-circle topology.
Paint4Poem consists of 301 high-quality poem-painting pairs collected manually from an influential modern Chinese artist Feng Zikai.
This dataset is a benchmark of 25 open-source subjects with 21.5k builds and 3.6k failed builds that enables a fair comparison and evaluation of Test Case Prioritization (TCP) techniques. We made our data collection tools available, which can be used to extend and update the subjects. The description of the structure and files of the dataset can be also found in the documentation of the data collection tool.
A dataset for Visual Voice Activity Detection extracted from the LRS3 dataset.
OpenViDial 2.0 is a larger-scale open-domain multi-modal dialogue dataset compared to the previous version OpenViDial 1.0. OpenViDial 2.0 contains a total number of 5.6 million dialogue turns extracted from either movies or TV series from different resources, and each dialogue turn is paired with its corresponding visual context.
EDGAR-CORPUS is a novel corpus comprising annual reports from all the publicly traded companies in the US spanning a period of more than 25 years. All the reports are downloaded, split into their corresponding items (sections), and provided in a clean, easy-to-use JSON format.
Description OV dataset is the camera calibration dataset. There are 16 lenses ranging from 90° to 180° FOV: