19,997 machine learning datasets
19,997 dataset results
Head and Neck Tumor Segmentation
The signing is recorded by a stationary color camera placed in front of the sign language interpreters. Interpreters wear dark clothes in front of an artificial grey background with color transition. All recorded videos are at 25 frames per second and the size of the frames is 210 by 260 pixels. Each frame shows the interpreter box only.
Details It contains 250 outdoor images of 1280$\times$720 pixels each. These images have been carefully annotated by experts on the computer vision field, hence no redundancy has been considered. In spite of that, all results have been cross-checked several times in order to correct possible mistakes or wrong edges by just one subject. This dataset is publicly available as a benchmark for evaluating edge detection algorithms. The generation of this dataset is motivated by the lack of edge detection datasets, actually, there is just one dataset publicly available for the edge detection task published in 2016 (MDBD: Multicue Dataset for Boundary Detection—the subset for edge detection). The level of details of the edge level annotations in the BIPED’s images can be appreciated looking at the GT, see Figs above.
CrisisMMD is a large multi-modal dataset collected from Twitter during different natural disasters. It consists of several thousands of manually annotated tweets and images collected during seven major natural disasters including earthquakes, hurricanes, wildfires, and floods that happened in the year 2017 across different parts of the World. The provided datasets include three types of annotations.
The Yahoo! Learning to Rank Challenge dataset consists of 709,877 documents encoded in 700 features and sampled from query logs of the Yahoo! search engine, spanning 29,921 queries.
CH-SIMS is a Chinese single- and multimodal sentiment analysis dataset which contains 2,281 refined video segments in the wild with both multimodal and independent unimodal annotations. It allows researchers to study the interaction between modalities or use independent unimodal annotations for unimodal sentiment analysis.
CrossWOZ is the first large-scale Chinese Cross-Domain Wizard-of-Oz task-oriented dataset. It contains 6K dialogue sessions and 102K utterances for 5 domains, including hotel, restaurant, attraction, metro, and taxi. Moreover, the corpus contains rich annotation of dialogue states and dialogue acts at both user and system sides.
A large scale dataset to enable the transfer step, exploiting the Natural Questions dataset.
A three million frame, multi-view, furniture assembly video dataset that includes depth, atomic actions, object segmentation, and human pose.
A parallel corpus of over 300 languages with around 100 thousand parallel sentences per language pair on average.
TinyPerson is a benchmark for tiny object detection in a long distance and with massive backgrounds. The images in TinyPerson are collected from the Internet. First, videos with a high resolution are collected from different websites. Second, images from the video are sampled every 50 frames. Then images with a certain repetition (homogeneity) are deleted, and the resulting images are annotated with 72,651 objects with bounding boxes by hand.
Retinal OCTA SEgmentation dataset (ROSE) consists of 229 OCTA images with vessel annotations at either centerline-level or pixel level.
Benchmarking Denoising Algorithms with Real Photographs
pathbased is a 3-cluster data set. The data set consists of a circular cluster with an opening near the bottom and two Gaussian distributed clusters inside. Each cluster contains 100 data points.
The MULTEXT-East resources are a multilingual dataset for language engineering research and development. It consists of the (1) MULTEXT-East morphosyntactic specifications, defining categories (parts-of-speech), their morphosyntactic features (attributes and values), and the compact MSD tagset representations; (2) morphosyntactic lexica, (3) the annotated parallel "1984" corpus; and (4) some comparable text and speech corpora. The specifications are available for the following macrolanguages, languages and language varieties: Albanian, Bulgarian, Chechen, Czech, Damaskini, English, Estonian, Hungarian, Macedonian, Persian, Polish, Resian, Romanian, Russian, Serbo-Croatian, Slovak, Slovene, Torlak, and Ukrainian, while the other resources are available for a subset of these languages.
The Behance Artistic Media dataset (BAM!) is a large-scale dataset of contemporary artwork from Behance, a website containing millions of portfolios from professional and commercial artists. We annotate Behance imagery with rich attribute labels for content, emotions, and artistic media. We believe our Behance Artistic Media dataset will be a good starting point for researchers wishing to study artistic imagery and relevant problems.
This is a dataset for a video super-resolution task. The dataset contains the most complex content for the restoration task: faces, text, QR-codes, car numbers, unpatterned textures, small details. Videos include different types of motion and different types of degradation: bicubic interpolation (BI) and Gaussian blurring and downsampling (BD). The resolution of all input video sequences is 480x320. Source: https://videoprocessing.ai/benchmarks/video-super-resolution.html Image Source: https://videoprocessing.ai/benchmarks/video-super-resolution.html
Chest ImaGenome is a dataset with a scene graph data structure to describe 242,072 images. Local annotations are automatically produced using a joint rule-based natural language processing (NLP) and atlas-based bounding box detection pipeline. Through a radiologist constructed CXR ontology, the annotations for each CXR are connected as an anatomy-centered scene graph, useful for image-level reasoning and multimodal fusion applications. Overall, the following are provided: i) 1256 combinations of relation annotations between 29 CXR anatomical locations (objects with bounding box coordinates) and their attributes, structured as a scene graph per image, ii) over 670,000 localized comparison relations (for improved, worsened, or no change) between the anatomical locations across sequential exams, as well as ii) a manually annotated gold standard scene graph dataset from 500 unique patients.
ZESHEL is a zero-shot entity linking dataset, which places more emphasis on understanding the unstructured descriptions of entities to resolve the ambiguity of mentions on four unseen domains.
PeMS07 is a traffic forecasting benchmark.