3,275 machine learning datasets
UNDD consists of 7,125 unlabelled day and night images, plus 75 night images with pixel-level annotations whose classes match those of the Cityscapes dataset.
Atlas is a dataset for e-commerce clothing product categorization. It provides a high-quality product taxonomy focused on clothing, with 186,150 images in the clothing category organized into a taxonomy with 3 levels and 52 leaf nodes.
This dataset contains Bangla handwritten numerals, basic characters, and compound characters. It was collected from multiple geographical locations within Bangladesh and includes samples from a variety of age groups. It can also be used for other classification problems, e.g. gender, age, or district.
CC-19 is a small dataset related to the novel coronavirus disease, COVID-19. It contains 34,006 CT scan slices (images) from 98 subjects, of which 28,395 slices belong to COVID-19-positive patients.
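The slice counts quoted for CC-19 imply a noticeable class imbalance. The following is a minimal sketch of that arithmetic, assuming every slice not attributed to a positive patient counts as negative (the description does not say so explicitly). The inverse-frequency weighting is an illustrative choice, not something the dataset prescribes, and all names are hypothetical.

```python
# Counts taken from the CC-19 description above.
TOTAL_SLICES = 34_006
POSITIVE_SLICES = 28_395

# Assumption: all remaining slices belong to non-positive subjects.
negative_slices = TOTAL_SLICES - POSITIVE_SLICES
positive_fraction = POSITIVE_SLICES / TOTAL_SLICES

# Illustrative inverse-frequency class weights: total / (n_classes * count).
counts = {"positive": POSITIVE_SLICES, "negative": negative_slices}
class_weights = {
    label: TOTAL_SLICES / (len(counts) * count)
    for label, count in counts.items()
}

print(f"negative slices: {negative_slices}")
print(f"positive fraction: {positive_fraction:.3f}")
print(f"class weights: {class_weights}")
```

Weights of this kind could be passed to a weighted loss so that the minority class is not ignored during training.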
A dataset that allows exploration of cross-modal retrieval where images contain scene-text instances.
The ECUST Food Dataset is a food recognition dataset that contains 2,978 images.
Goldfinch is a dataset for fine-grained recognition challenges. It contains a list of bird, butterfly, aircraft, and dog categories with relevant Google image search and Flickr search URLs. It also includes a set of active learning annotations on dog categories.
HASY is a dataset of single symbols similar to MNIST. It contains 168,233 instances of 369 classes. HASY contains two challenges: a classification challenge with 10 pre-defined folds for 10-fold cross-validation, and a verification challenge.
An annotated dataset for dynamic scene classification, comprising 80 hours of diverse, high-quality driving video clips collected in the San Francisco Bay Area. The dataset includes temporal annotations for road places, road types, weather, and road surface conditions.
iFakeFaceDB is a face image dataset for the study of synthetic face manipulation detection, comprising about 87,000 synthetic face images generated by the Style-GAN model and transformed with the GANprintR approach. All images were aligned and resized to 224 x 224 pixels.
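For a classification challenge that ships with pre-defined folds, as HASY does, each fold is held out in turn while the remaining folds are used for training. The sketch below illustrates that loop on toy data. It assumes fold membership is available as one integer fold id per sample, which is a hypothetical representation and not the actual HASY file layout; the function name is likewise illustrative.

```python
from typing import Iterator, List, Sequence, Tuple


def iter_predefined_folds(
    fold_ids: Sequence[int], n_folds: int = 10
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices), holding out each fold once."""
    for held_out in range(n_folds):
        test = [i for i, f in enumerate(fold_ids) if f == held_out]
        train = [i for i, f in enumerate(fold_ids) if f != held_out]
        yield train, test


# Toy data: 30 samples assigned round-robin to 10 folds (hypothetical).
toy_fold_ids = [i % 10 for i in range(30)]
splits = list(iter_predefined_folds(toy_fold_ids))

for fold, (train, test) in enumerate(splits):
    print(f"fold {fold}: {len(train)} train, {len(test)} test")
```

Because the folds are fixed in advance, results from different methods are evaluated on identical splits, which is the point of distributing the folds with the dataset.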
The Kenyan Food Type Dataset (KenyanFood13) is an image classification dataset for Kenyan food. The images are categorized into 13 different labels.
The PhotoSynth (PS) dataset for patch matching consists of 30 scenes in total, with 25 scenes for training and 5 scenes for validation. Image pairs are captured under different illumination conditions, at different scales, and from different viewpoints.
A new large-scale retail product dataset for fine-grained image classification. Unlike previous datasets focusing on relatively few products, it comprises more than 500,000 images of retail products on shelves, belonging to 2,000 different products. The dataset aims to advance research in retail object recognition, which has many applications such as automatic shelf auditing and image-based product information retrieval.
The San Francisco Landmark Dataset contains a database of 1.7 million images of buildings in San Francisco with ground truth labels, geotags, and calibration data, as well as a difficult query set of 803 images taken with a variety of camera phones. The data was originally acquired by vehicle-mounted cameras with wide-angle lenses capturing spherical panoramic images. For all visible buildings in each panorama, a set of overlapping perspective images is generated.
Provides a set of stereo-rectified images and the associated ground-truth disparities for 10 areas of interest (AOIs) drawn from two sources: 8 AOIs from IARPA's MVS Challenge dataset and 2 AOIs from the CORE3D-Public dataset.
Contains 6,016 image pairs from the wild, shedding light upon a rich and diverse set of criteria employed by human beings.
TuSimple Lane is an extension of the TuSimple dataset with 14,336 lane boundary annotations. Each lane boundary in the dataset is annotated with one of 7 classes, such as “Single Dashed”, “Double Dashed” or “Single White Continuous”.
UAV-GESTURE is a dataset for UAV control and gesture recognition. It is a video dataset recorded outdoors, covering 13 gestures suitable for basic UAV navigation and command, drawn from general aircraft handling and helicopter handling signals. It contains 119 high-definition video clips consisting of 37,151 frames.
US-4 is a dataset of ultrasound (US) images. It is a video-based image dataset that contains over 23,000 high-resolution images from four US video sub-datasets, two of which were newly collected by experienced doctors for this dataset.
The Vocal Folds dataset is a dataset for automatic segmentation of laryngeal endoscopic images. It consists of 8 sequences from 2 patients, containing 536 hand-segmented in vivo colour images of the larynx, recorded during two different resection interventions at a resolution of 512x512 pixels.