3,275 machine learning datasets
iBugMask is an in-the-wild face parsing dataset that contains 1,000 challenging face images and manually annotated labels for 11 semantic classes: background, facial skin, left/right brow, left/right eye, nose, upper/lower lip, inner mouth, and hair. The images are curated from challenging in-the-wild face alignment datasets, including 300W and Menpo. Compared with existing face parsing datasets, iBugMask contains in-the-wild scenarios such as “party” and “conference”, which include more challenging appearance variations or multiple faces. It also includes a larger number of profile faces and expressions beyond “neutral” and “smile” (e.g. “surprise” and “scream”). The dataset is available for download.
FSVOD-500 is a large-scale video dataset comprising 500 classes with class-balanced videos in each category for few-shot learning. It is the first benchmark specifically designed for few-shot video object detection, evaluating the performance of a given model on novel classes.
The Kvasir-Capsule dataset is the largest publicly released video capsule endoscopy (VCE) dataset. It contains 47,238 labeled images and 117 videos, capturing anatomical landmarks as well as pathological and normal findings. In total, this amounts to more than 4,741,621 images and video frames.
The Caltech Cars dataset consists of 126 rear-view photographs taken in parking lots. Each image has a resolution of 896 × 592 pixels and shows a single vehicle as the primary subject. The photographs were taken in daylight with a handheld camera at roughly the same distance in every instance.
This dataset contains 2,000 dial meter images obtained on-site by employees of the Energy Company of Paraná (Copel), which serves more than 4 million consuming units in the Brazilian state of Paraná. The images were acquired with many different cameras and are available in the JPG format with 320×640 or 640×320 pixels (depending on the camera orientation).
TabLeX is a large-scale benchmark dataset comprising table images generated from scientific articles. TabLeX consists of two subsets, one for table structure extraction and the other for table content extraction. Each table image is accompanied by its corresponding LaTeX source code. To facilitate the development of robust table information extraction (IE) tools, TabLeX contains images in different aspect ratios and in a variety of fonts.
Global symmetry ground truth for the AVA dataset.
The fetoscopy placenta dataset is associated with our MICCAI 2020 publication titled “Deep Placental Vessel Segmentation for Fetoscopic Mosaicking”. The dataset contains 483 frames with ground-truth vessel segmentation annotations taken from six different in vivo fetoscopic procedure videos. The dataset also includes six unannotated in vivo continuous fetoscopic video clips (950 frames) with predicted vessel segmentation maps obtained from the leave-one-out cross-validation of our method.
ReactionGIF is an affective dataset of 30K tweets which can be used for tasks like induced sentiment prediction and multilabel classification of induced emotions.
EPISURG is a clinical dataset of $T_1$-weighted magnetic resonance images (MRI) from 430 epileptic patients who underwent resective brain surgery at the National Hospital for Neurology and Neurosurgery (Queen Square, London, United Kingdom) between 1990 and 2018.
The RISE (Robust Indoor Localization in Complex Scenarios) dataset is meant to train and evaluate visual indoor place recognizers. It contains more than 1 million geo-referenced images spread over 30 sequences, covering 5 heterogeneous buildings. For each building we provide:
- A high-resolution 3D point cloud (1 cm) that defines the localization reference frame and that was generated with a mobile laser scanner and an inertial system.
- Several image sequences spread over time with accurate ground-truth poses retrieved by the laser scanner. Each sequence contains both stereo pairs and spherical images.
- Geo-referenced smartphone data, retrieved from the standard sensors of such devices.
UW-IS (UW Indoor Scenes) is a dataset for object recognition in indoor environments comprising scene images from two different environments, namely, a living room and a mock warehouse.
This mouse cerebellar atlas can be used for mouse cerebellar morphometry.
The Herbarium Half-Earth dataset is a large and diverse dataset of herbarium specimens for automatic taxon recognition. The Herbarium 2021: Half-Earth Challenge dataset includes more than 2.5M images representing nearly 65,000 species from the Americas and Oceania that have been aligned to a standardized plant list.
Rent3D++ is an extension of the Rent3D floorplans + photos dataset. The floorplans are annotated with room outline polygons, doors/windows as line segments, object-icons as axis-aligned bounding boxes, room-door-room connectivity graphs, and photo-room assignments. We have extracted rectified surface crops from architectural surfaces in photos, and these can drive interior texturing/material modeling tasks. This dataset can be used with our paper Plan2Scene to generate textured 3D mesh models of houses using floorplans and photos.
Thumb Index 1000 (TI1K) is a dataset of 1,000 hand images annotated with the hand bounding box and the thumb and index fingertip positions. The dataset includes the natural movement of the thumb and index fingers, making it suitable for mixed reality (MR) applications.
Ambiguous-HOI is a challenging dataset, based on HICO-DET, that contains ambiguous human-object interaction (HOI) images for HOI detection.
We present TNCR, a new table dataset with varying image quality collected from free open source websites. The TNCR dataset can be used for table detection in scanned document images and for classifying the detected tables into 5 different classes.
This dataset is a composition of scenes taken by the SPOT sensor in 2005 over four counties in the State of Minas Gerais, Brazil: Arceburgo, Guaranesia, Guaxupé and Monte Santo. It contains multispectral high-resolution scenes of coffee crops and non-coffee areas. It has high intraclass variance caused by different crop management techniques, as well as scenes with different plant ages and/or with spectral distortions caused by shadows.