3,275 machine learning datasets
HR-Crime is a subset of the UCF-Crime dataset suitable for human-related anomaly detection tasks.
Multispectral and HD vineyard orthomosaics from central Portugal
A dataset of images containing leaves from 15 tree classes.
Bosch Industrial Depth Completion Dataset (BIDCD) is an RGBD dataset of static table-top scenes with industrial objects. The data was collected with a RealSense depth camera mounted on a robotic arm, i.e. from multiple Points-of-View (POV), approximately 60 for each scene. We generated depth ground truth with a customized pipeline for removing erroneous depth values, and applied multi-view geometry to fuse the cleaned depth frames and fill in missing information. The fused scene mesh was back-projected to each POV, and finally a bilateral filter was applied to reduce the remaining holes.
A high-resolution multi-sensor remote sensing scene classification dataset, appropriate for training and evaluating image classification models in the remote sensing domain.
Stereo Waterdrop is a real-world dataset for research on stereo waterdrop removal. The dataset contains 837 stereo image pairs captured from 129 indoor and outdoor scenes with various waterdrops, disparities, and illumination conditions. We use the ZED 2 stereo camera for data collection.
WikiChurches is a dataset for architectural style classification, consisting of 9,485 images of church buildings. Both images and style labels were sourced from Wikipedia. The dataset can serve as a benchmark for various research fields, as it combines numerous real-world challenges: fine-grained distinctions between classes based on subtle visual features, a comparatively small sample size, a highly imbalanced class distribution, a high variance of viewpoints, and a hierarchical organization of labels, where only some images are labeled at the most precise level.
MHMD (Modern Historical Movies Dataset) is a dataset for old image colorization, built from historical movies. It consists of 1,353,166 images and 42 labels of eras, nationalities, and garment types for automatic colorization, drawn from 147 historical movies or TV series made in modern times.
STN PLAD is a high-resolution and real-world image dataset of multiple high-voltage power line components. It has 2,409 annotated objects divided into five classes: transmission tower, insulator, spacer, tower plate, and Stockbridge damper, which vary in size (resolution), orientation, illumination, angulation, and background.
P. vivax (malaria) infected human blood smears with bounding box annotations. The data consists of two classes of uninfected cells (RBCs and leukocytes) and four classes of infected cells (gametocytes, rings, trophozoites, and schizonts).
The Multi-Attributed and Structured Text-to-face (MAST) dataset is a new data consolidation. The motivation is to have a large corpus of high-quality face images with fine-grained and attribute-focussed annotations. This combines the benefits of the attribute-oriented approach with the semantics of a textual description.
Open Set Logo Detection Dataset (OSLD Dataset) is a dataset of eCommerce product images with associated brand logo images. It is released under a Creative Commons (CC BY-NC 4.0) license to promote research in open set logo detection. The dataset can be used only for research purposes. The dataset contains:
N15News is a large-scale multimodal news dataset comprising 200K image-text pairs and 15 categories, exceeding the previous news dataset in both the number of categories and samples.
PhotoMatte85 contains 85 portrait images. The dataset was donated to us by a third-party commercial company. The footage was shot with professional studio lighting, and the subjects are in standard portrait poses. We provide the alpha matte and foreground images extracted from the green screen photos. Due to licensing issues, we will not release the other 13K images used in training.
The sRGB2XYZ dataset contains ~1,200 pairs of camera-rendered sRGB and the corresponding scene-referred CIE XYZ images (971 training, 50 validation, and 244 testing images).
FunKPoint is a dataset for finding correspondences in visual data that has ground truth correspondences for 10 tasks and 20 object categories.
The MFH dataset is a multi-viewpoint fine-grained hand hygiene dataset. It contains 731,147 samples in total, collected from 6 camera views in 6 different locations. All samples are split into 7 classes. The MFH dataset is distinguished from existing datasets in three aspects: the large intra-class difference, the subtle inter-class difference, and the mismatch in data distribution between the training phase and the inference phase. This dataset thus provides a more realistic benchmark.
Indiscapes2 is a new large-scale, diverse dataset of Indic manuscripts with semantic layout annotations. Indiscapes2 contains documents from four different historical collections and is 150% larger than its predecessor, Indiscapes.
YorkTag provides pairs of sharp/blurred images containing fiducial markers and is proposed for training our model and evaluating it both qualitatively and quantitatively.
The dataset consists of 53,189 wikiHow articles across various categories of everyday tasks, 155,265 methods, and 772,294 steps with corresponding images.