HatemojiCheck is a test suite of 3,930 test cases for detecting emoji-based hate, covering seven functionalities of emoji-based hate and six targeted identities.
The WikiScenes dataset consists of paired images and language descriptions capturing world landmarks and cultural sites, with associated 3D models and camera poses. WikiScenes is derived from the massive public catalog of freely-licensed crowdsourced data in the Wikimedia Commons project, which contains a large variety of images with captions and other metadata.
WDC-Dialogue is a dataset built from Chinese social media to train EVA. Specifically, conversations from various sources are gathered, and a rigorous data-cleaning pipeline is designed to ensure the quality of WDC-Dialogue.
Marine Debris Turntable is a dataset for sonar perception.
The RareDis corpus annotates more than 5,000 rare diseases and almost 6,000 clinical manifestations. Moreover, the inter-annotator agreement evaluation shows relatively high agreement (an F1-measure of 83.5% under exact-match criteria for entities and 81.3% for relations). These results indicate that the corpus is of high quality, marking a significant step for the field given the scarcity of available corpora annotated with rare diseases.
DeepFake MNIST+ is a deepfake facial animation dataset generated by a state-of-the-art image animation generator. It includes 10,000 facial animation videos spanning ten different actions, which can spoof recent liveness detectors.
This is the dataset to support the paper:
The ASR-GLUE benchmark is a collection of 6 different NLU (Natural Language Understanding) tasks for evaluating the performance of models under automatic speech recognition (ASR) errors, across 3 different levels of background noise and 6 speakers with various voice characteristics.
SHIFT15M is a dataset that can be used to properly evaluate models in situations where the distribution of data changes between training and testing. The SHIFT15M dataset has several good properties: (i) Multiobjective: each instance in the dataset has several numerical values that can be used as target variables. (ii) Large-scale: the SHIFT15M dataset consists of 15 million fashion images. (iii) Coverage of types of dataset shifts: SHIFT15M contains multiple dataset-shift problem settings (e.g., covariate shift or target shift). SHIFT15M also enables evaluation of model performance under various magnitudes of dataset shift by varying the shift magnitude.
HeadlineCause is a dataset for detecting implicit causal relations between pairs of news headlines. The dataset includes over 5,000 headline pairs from English news and over 9,000 headline pairs from Russian news, labeled through crowdsourcing. The pairs range from totally unrelated, or belonging to the same general topic, to pairs exhibiting causation and refutation relations.
TIMo (Time-of-Flight Indoor Monitoring) is a dataset of infrared and depth videos intended for use in anomaly detection and person detection/people counting. It features more than 1,500 sequences for anomaly detection, which sum to more than 500,000 individual frames. For person detection, the dataset contains more than 240 sequences. The data was captured using a Microsoft Azure Kinect RGB-D camera. In addition, we provide annotations of anomalous frame ranges for use with anomaly detection, and bounding boxes and segmentation masks for use with person detection. The data was captured partly from a tilted view and partly from a top-down perspective.
We consider the task of identifying human actions visible in online videos. We focus on the widespread genre of lifestyle vlogs, which consist of videos of people performing actions while verbally describing them. Our goal is to identify whether actions mentioned in the speech description of a video are visually present.
BnB is a large-scale and diverse in-domain VLN (Vision and Language Navigation) dataset.
A dataset for evaluating a system's understanding of given passages.
Commonsense-Dialogues is a crowdsourced dataset of ~11K dialogues grounded in social contexts involving utilization of commonsense. The social contexts used were sourced from the train split of the SocialIQA dataset, a multiple-choice question-answering based social commonsense reasoning benchmark.
BenchIE is a benchmark and evaluation framework for comprehensive evaluation of OIE systems in English, Chinese, and German. In contrast to existing OIE benchmarks, BenchIE takes into account the informational equivalence of extractions: its gold standard consists of fact synsets, clusters in which all surface forms of the same fact are exhaustively listed.
The Konzil dataset was created by specialists at the University of Greifswald. It contains manuscripts written in modern German. The training sample consists of 353 lines, the validation sample of 29 lines, and the test sample of 87 lines.
Schiller contains handwritten texts in modern German. The training sample consists of 244 lines, the validation sample of 21 lines, and the test sample of 63 lines.
Ricordi contains handwritten texts in Italian. The training sample consists of 295 lines, the validation sample of 19 lines, and the test sample of 69 lines.
Patzig contains handwritten texts in modern German. The training sample consists of 485 lines, the validation sample of 38 lines, and the test sample of 118 lines.