3,275 machine learning datasets
3,275 dataset results
A new underwater dataset that has been recorded in an harbor and provides several sequences with synchronized measurements from a monocular camera, a MEMS-IMU and a pressure sensor.
The AU-AIR is a multi-modal aerial dataset captured by a UAV. Having visual data, object annotations, and flight data (time, GPS, altitude, IMU sensor data, velocities), AU-AIR meets vision and robotics for UAVs.
Bangladeshi Sign Language Image Dataset (BdSLImset) is a dataset that contains images of different Bangladeshi sign letters.
Colorectal Adenoma contains 177 whole slide images (156 contain adenoma) gathered and labelled by pathologists from the Department of Pathology, The Chinese PLA General Hospital.
MVTec D2S is a benchmark for instance-aware semantic segmentation in an industrial domain. It contains 21,000 high-resolution images with pixel-wise labels of all object instances. The objects comprise groceries and everyday products from 60 categories. The benchmark is designed such that it resembles the real-world setting of an automatic checkout, inventory, or warehouse system. The training images only contain objects of a single class on a homogeneous background, while the validation and test sets are much more complex and diverse.
This is an open-source image captions dataset for the aesthetic evaluation of images. The dataset is called DPC-Captions, which contains comments of up to five aesthetic attributes of one image through knowledge transfer from a full-annotated small-scale dataset.
The Double-Sided Braille Image dataset (DSBI) is a large-scale dataset for Braille image recognition. It has detailed Braille recto dots, verso dots and Braille cells annotation.
Egoshots is a 2-month Ego-vision Dataset with Autographer Wearable Camera annotated "for free" with transfer learning. Three state of the art pre-trained image captioning models are used. The dataset represents the life of 2 interns while working at Philips Research (Netherlands) (May-July 2015) generously donating their data.
This dataset consists of 3,710 flood images, annotated by domain experts regarding their relevance with respect to three tasks (determining the flooded area, inundation depth, water pollution).
A large-scale and accurate dataset for vision-based railway traffic light detection and recognition.The recordings were made on selected running trains in France and benefited from carefully hand-labeled annotations.
A verified-by-experts repository of 3050 human rights violations photographs, labelled with human rights semantic categories, comprising a list of the types of human rights abuses encountered at present.
The Human-Parts dataset is a dataset for human body, face and hand detection with ~15k images. It contains ~106k different annotations, with multiple annotations per image.
Icons-50 is a dataset for studying surface variation robustness.
The Image-MusicEmotion-Matching-Net (IMEMNet) dataset is a dataset for continuous emotion-based image and music matching. It has over 140K image-music pairs.
A large-scale video database for rain removal (LasVR), which consists of 316 rain videos.
The MAMe dataset contains images of high-resolution and variable shape of artworks from 3 different museums:
This dataset contains 1203 individuals captured from two disjoint camera views. To each person, one to twelve images are captured from one to six different orientations under one camera view and are normalized to 128x64 pixels. This dataset is constructed based on the Market-1501 benchmark data and the orientation label for each image has been manually annotated.
ODMS is a dataset for learning Object Depth via Motion and Segmentation. ODMS training data are configurable and extensible, with each training example consisting of a series of object segmentation masks, camera movement distances, and ground truth object depth. As a benchmark evaluation, the dataset provides four ODMS validation and test sets with 15,650 examples in multiple domains, including robotics and driving.
(L)ifel(O)ng (R)obotic V(IS)ion (OpenLORIS) - Object Recognition Dataset (OpenLORIS-Object) is designed for accelerating the lifelong/continual/incremental learning research and application,currently focusing on improving the continuous learning capability of the common objects in the home scenario.
The data includes all movement trajectories extracted from the videos of Parkinson's assessments using Convolutional Pose Machines (CPM) as well as the confidence values from CPM. The dataset also includes ground truth ratings of parkinsonism and dyskinesia severity using the UDysRS, UPDRS, and CAPSIT.