19,997 machine learning datasets
DENSE (Depth Estimation oN Synthetic Events) is a new dataset with synthetic events and perfect ground truth.
CMRC 2018 is a dataset for Chinese Machine Reading Comprehension. Specifically, it is a span-extraction reading comprehension dataset that is similar to SQuAD.
DVQA is a synthetic question-answering dataset on images of bar-charts.
Multilingual Knowledge Questions and Answers (MKQA) is an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages (260k question-answer pairs in total). The goal of this dataset is to provide a challenging benchmark for question answering quality across a wide set of languages. Answers are based on a language-independent data representation, making results comparable across languages and independent of language-specific passages. With 26 languages, this dataset supplies the widest range of languages to date for evaluating question answering.
REalistic Single Image DEhazing (RESIDE) is a new large-scale benchmark consisting of both synthetic and real-world hazy images. RESIDE highlights diverse data sources and image contents, and is divided into five subsets, each serving different training or evaluation purposes.
TAO is a federated dataset for Tracking Any Object, containing 2,907 high resolution videos, captured in diverse environments, which are half a minute long on average. A bottom-up approach was used for discovering a large vocabulary of 833 categories, an order of magnitude more than prior tracking benchmarks.
WildDeepfake is a dataset for real-world deepfake detection. It consists of 7,314 face sequences extracted from 707 deepfake videos collected entirely from the internet. WildDeepfake is a small dataset that can be used, in addition to existing datasets, to develop more effective detectors against real-world deepfakes.
The inD dataset is a new dataset of naturalistic vehicle trajectories recorded at German intersections. Recording with a drone overcomes typical limitations of established traffic data collection methods, such as occlusions. Traffic was recorded at four different locations. The trajectory and type of each road user are extracted. With state-of-the-art computer vision algorithms, the positional error is typically less than 10 centimetres. The dataset is applicable to many tasks such as road user prediction, driver modeling, scenario-based safety validation of automated driving systems, or data-driven development of HAD system components.
MMKG is a collection of three knowledge graphs for link prediction and entity matching research. Contrary to other knowledge graph datasets, these knowledge graphs contain both numerical features and images for all entities as well as entity alignments between pairs of KGs. While MMKG is intended to perform relational reasoning across different entities and images, previous resources are intended to perform visual reasoning within the same image.
The DukeMTMC-VideoReID (Duke Multi-Target, Multi-Camera Video-based ReIDentification) dataset is a subset of DukeMTMC for video-based person re-ID. The dataset is created from high-resolution videos from 8 different cameras. It is one of the largest pedestrian video datasets in which images are cropped using hand-drawn bounding boxes. The dataset consists of 4,832 tracklets of 1,812 identities in total, and each tracklet has 168 frames on average.
CoMplex video Object SEgmentation (MOSE) is a dataset for studying the tracking and segmentation of objects in complex environments. MOSE contains 2,149 video clips and 5,200 objects from 36 categories, with 431,725 high-quality object segmentation masks. The most notable feature of the MOSE dataset is its complex scenes with crowded and occluded objects.
The OPUS-MT benchmark is a systematic collection of results from open translation models, focusing on verifiable translation performance and large coverage in terms of languages and domains. The OPUS-MT Dashboard is a web-based platform that provides a comprehensive overview of these models. It includes summaries of benchmarks for over 2,300 models covering 4,560 language directions and 294 languages. The aim is to centralize and reproduce MT evaluation with broad coverage and scalability.
Roman-empire is a word dependency graph based on the Roman Empire article from the English Wikipedia.
In Clipart1k, the target domain classes to be detected are the same as those in the source domain. All the images for the clipart domain were collected from one dataset (CMPlaces) and two image search engines (Openclipart and Pixabay). The search queries were the 205 scene classes (e.g., pasture) used in CMPlaces, chosen to collect various objects and scenes with complex backgrounds.
The TextbookQuestionAnswering (TQA) dataset is drawn from middle school science curricula. It consists of 1,076 lessons from Life Science, Earth Science and Physical Science textbooks. The lessons contain 26,260 questions, 12,567 of which have an accompanying diagram.
A new challenge set for multimodal classification, focusing on detecting hate speech in multimodal memes.
The KIT Motion-Language dataset links human motion and natural language.
MedMentions is a new manually annotated resource for the recognition of biomedical concepts. What distinguishes MedMentions from other annotated biomedical corpora is its size (over 4,000 abstracts and over 350,000 linked mentions), as well as the size of the concept ontology (over 3 million concepts from UMLS 2017) and its broad coverage of biomedical disciplines.
The Objectron dataset is a collection of short, object-centric video clips, which are accompanied by AR session metadata that includes camera poses, sparse point-clouds and characterization of the planar surfaces in the surrounding environment. In each video, the camera moves around the object, capturing it from different angles. The data also contain manually annotated 3D bounding boxes for each object, which describe the object’s position, orientation, and dimensions. The dataset consists of 15K annotated video clips supplemented with over 4M annotated images in the following categories: bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes. To ensure geo-diversity, the dataset is collected from 10 countries across five continents.
3D-FUTURE (3D FUrniture shape with TextURE) is a 3D dataset that contains 20,240 photo-realistic synthetic images captured in 5,000 diverse scenes, together with the 9,992 unique industrial 3D CAD furniture shapes involved, which have high-resolution informative textures developed by professional designers.