3,275 machine learning datasets
iWildCam 2021 is a dataset for counting the number of animals of each species that appear in sequences of images captured with camera traps. The training data and test data are from different cameras spread across the globe. The sets of species seen by the cameras overlap but are not identical. The challenge is to categorize species and count the number of individuals across image bursts.
DirtyMNIST is a concatenation of MNIST + AmbiguousMNIST, with 60k samples each in the training set. AmbiguousMNIST contains additional ambiguous digits with varying ambiguity. The AmbiguousMNIST test set contains 60k ambiguous samples as well.
Fetoscopic Placental Vessel Segmentation and Registration (FetReg) is a large-scale multi-centre dataset for the development of generalized and robust semantic segmentation and video mosaicking algorithms for the fetal environment, with a focus on creating drift-free mosaics from long-duration fetoscopy videos.
This image set is part of a high-throughput chemical screen on U2OS cells, with examples of 200 bioactive compounds. The effect of the treatments was originally imaged using the Cell Painting assay (fluorescence microscopy). This data set only includes the DNA channel of a single field of view per compound. These images present a variety of nuclear phenotypes, representative of high-throughput chemical perturbations. The main use of this data set is the study of segmentation algorithms that can separate individual nucleus instances in an accurate way, regardless of their shape and cell density. The collection has around 23,000 single nuclei manually annotated to establish a ground truth collection for segmentation evaluation.
Imgur5K is a large-scale handwritten in-the-wild dataset, containing challenging real-world handwritten samples from nearly 5K writers. It consists of ~135K handwritten English words from 5K different images. As opposed to existing datasets for OCR, which have limited variability in their images, the images in Imgur5K contain a diverse set of styles.
Fishnet Open Images Database is a large dataset of electronic monitoring (EM) imagery for fish detection and fine-grained categorisation onboard commercial fishing vessels. The dataset consists of 86,029 images containing 34 object classes, making it the largest and most diverse public dataset of fisheries EM imagery to date. It includes many of the characteristic challenges of EM data: visual similarity between species, skewed class distributions, harsh weather conditions, and chaotic crew activity.
A dataset of 23,000 cropped images of tree bark from 23 species of trees around Quebec City, Canada. The images were captured at a distance of 20-60 cm from the trunk. Labels include the individual tree ID, its species, and its DBH (diameter at breast height). Pictures were taken with four different devices: Nexus 5, Samsung Galaxy S5, Samsung Galaxy S7, and a Panasonic Lumix DMC-TS5 camera. The dataset is sufficiently large to train a deep network such as ResNet for species recognition.
The Forms Dataset is a dataset for document structure extraction comprising 5K forms.
Vehicle-Rear is a novel dataset for vehicle identification that contains more than three hours of high-resolution videos, with accurate information about the make, model, color and year of nearly 3,000 vehicles, in addition to the position and identification of their license plates.
The Action-Camera Parking Dataset contains 293 images captured from a height of roughly 10 meters using a GoPro Hero 6 camera. It can be used for training machine learning models that perform image-based parking space occupancy classification.
Replay data from human players and AI agents navigating in a 3D game environment.
A dataset for 2D pose estimation of anime/manga images.
FoodLogoDet-1500 is a new large-scale publicly available food logo dataset, which has 1,500 categories, about 100,000 images and about 150,000 manually annotated food logo objects.
The WikiScenes dataset consists of paired images and language descriptions capturing world landmarks and cultural sites, with associated 3D models and camera poses. WikiScenes is derived from the massive public catalog of freely-licensed crowdsourced data in the Wikimedia Commons project, which contains a large variety of images with captions and other metadata.
SHIFT15M is a dataset that can be used to properly evaluate models in situations where the distribution of data changes between training and testing. The SHIFT15M dataset has several good properties: (i) Multiobjective. Each instance in the dataset has several numerical values that can be used as target variables. (ii) Large-scale. The SHIFT15M dataset consists of 15 million fashion images. (iii) Coverage of types of dataset shifts. SHIFT15M contains multiple dataset shift problem settings (e.g., covariate shift or target shift). SHIFT15M also makes it possible to evaluate model performance under dataset shifts of various magnitudes by switching the magnitude.
BnB is a large-scale and diverse in-domain VLN (Vision and Language Navigation) dataset.
The Konzil dataset was created by specialists at the University of Greifswald. It contains manuscripts written in modern German. The training set consists of 353 lines, the validation set of 29 lines, and the test set of 87 lines.
Schiller contains handwritten texts written in modern German. The training set consists of 244 lines, the validation set of 21 lines, and the test set of 63 lines.
Ricordi contains handwritten texts written in Italian. The training set consists of 295 lines, the validation set of 19 lines, and the test set of 69 lines.
Patzig contains handwritten texts written in modern German. The training set consists of 485 lines, the validation set of 38 lines, and the test set of 118 lines.
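For a quick comparison of the four handwriting datasets listed above (Konzil, Schiller, Ricordi, Patzig), the following minimal sketch totals the stated line counts and computes the share of each split. The counts are copied from the descriptions; the names `SPLITS` and `split_summary` are hypothetical helpers introduced only for illustration and are not part of any dataset's tooling.

```python
# Line counts as stated in the four dataset descriptions.
SPLITS = {
    "Konzil": {"train": 353, "validation": 29, "test": 87},
    "Schiller": {"train": 244, "validation": 21, "test": 63},
    "Ricordi": {"train": 295, "validation": 19, "test": 69},
    "Patzig": {"train": 485, "validation": 38, "test": 118},
}


def split_summary(counts):
    """Return the total number of lines and each split's fraction of it."""
    total = sum(counts.values())
    fractions = {split: n / total for split, n in counts.items()}
    return total, fractions


if __name__ == "__main__":
    for name, counts in SPLITS.items():
        total, fractions = split_summary(counts)
        shares = ", ".join(f"{split} {frac:.1%}" for split, frac in fractions.items())
        print(f"{name}: {total} lines ({shares})")
```

All four datasets are small (a few hundred lines each), so the validation and test splits contain only tens of lines; evaluation metrics computed on them are likely to be noisy.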