3,275 machine learning datasets
Description: 23 Pairs of Identical Twins Face Image Data. The collection scenes include indoor and outdoor settings. The subjects are Chinese males and females. The data diversity includes multiple face angles, multiple face postures, close-ups of the eyes, multiple lighting conditions, and multiple age groups. This dataset can be used for tasks such as twins' face recognition.
This dataset includes multi-spectral acquisitions of vegetation for the conception of new DeepIndices. The images were acquired with the Airphen (Hyphen, Avignon, France) six-band multi-spectral camera configured with the 450/570/675/710/730/850 nm bands at 10 nm FWHM. The dataset was acquired on the INRAe site in Montoldre (Allier, France, at 46°20'30.3"N 3°26'03.6"E) within the framework of the "RoSE challenge" funded by the French National Research Agency (ANR), and in Dijon (Burgundy, France, at 47°18'32.5"N 5°04'01.8"E) on the AgroSup Dijon site. Images of bean and corn, containing various natural weeds (yarrows, amaranth, geranium, plantago, etc.) and sown ones (mustards, goosefoots, mayweed and ryegrass) with very distinct illumination characteristics (shadow, morning, evening, full sun, cloudy, rain, ...), were acquired in top-down view at 1.8 meters from the ground. (2020-05-01)
Medical VQA dataset built from the IDRiD and eOphta datasets. The dataset contains both healthy and unhealthy fundus images. For each image, a set of pre-defined questions is generated, including questions about regions (e.g., "Are there hard exudates in this region?"), for which an associated mask denotes the location of the region.
Inpainting networks are typically benchmarked on samples from the Places2 dataset. However, that dataset does not provide high-resolution images for evaluation purposes. Instead, we use images from the Unsplash-Lite Dataset, which contains 25k high-resolution nature-themed photos. We randomly sampled 1,000 images from the dataset. Each image is resized and cropped to 1024x1024, and a set of masks is generated with thin, medium, and thick brush strokes, using the methodology described in LaMa.
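The brush-stroke masks above could be approximated as below. This is only a toy sketch: the real thin/medium/thick stroke parameters follow the LaMa paper's mask-generation methodology, and the random-walk stroke, square brush, and chosen radii here are illustrative assumptions, not the actual procedure.

```python
import random

def brush_stroke_mask(size=1024, steps=200, radius=12, seed=0):
    """Toy stand-in for a brush-stroke inpainting mask: a random walk
    across the image, stamped at each step with a square brush of the
    given radius. Returns a size x size grid of 0/1 values, where 1
    marks masked (to-be-inpainted) pixels."""
    rng = random.Random(seed)
    mask = [[0] * size for _ in range(size)]
    x, y = rng.randrange(size), rng.randrange(size)
    for _ in range(steps):
        # Take a bounded random step, clamped to the image.
        x = min(max(x + rng.randint(-15, 15), 0), size - 1)
        y = min(max(y + rng.randint(-15, 15), 0), size - 1)
        # Stamp the brush around the current position.
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                nx, ny = x + dx, y + dy
                if 0 <= nx < size and 0 <= ny < size:
                    mask[ny][nx] = 1
    return mask
```

Varying `steps` and `radius` would give the thin/medium/thick variants; the published generator draws poly-line strokes with random angles and widths rather than a plain random walk.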
The Synthetic Visual Reasoning Test (SVRT) is a series of 23 classification problems involving images of randomly generated shapes.
In the Adjudicator Scores_Short Stories and Written Reflections folder: four files from four student participants in the contest. Each file contains
In the Pre-Contest Workshop Slidedeck.pdf: instructional materials delivered during the seven pre-contest workshops
MatriVasha is the largest dataset of handwritten Bangla compound characters for research on handwritten Bangla compound character recognition. The dataset covers 120 different types of compound characters across 306,464 images, of which 152,950 were written by males and 153,514 by females. Because the samples were collected with district authenticity, varied age groups, and an equal number of men and women, this dataset can also be used for other problems such as gender-, age-, or district-based handwriting research.
This dataset, named the Crowd Activity dataset, concentrates on the activities of crowds for a fine-grained image classification task, as automatically understanding crowd activity is meaningful for social security. The dataset is newly collected: the images were mainly found on the Internet or captured on streets with mobile phones. All images contain at least one text instance. The categories come from activities of daily living and from demonstrations stimulated by notable events of recent years. Specifically, the dataset consists of 21 categories and 8,785 images in total. The 21 categories broadly fall into two types: activities of daily living (i.e., celebrating Christmas, holding a sports meeting, holding a concert, celebrating a birthday party, celebrity speech, teaching, graduation ceremony, picnic, press briefing, shopping, celebrating Thanksgiving Day) and demonstrations (i.e., protecting animals, protecting the environment, appealing for peace, Brexit, COVID-19, ele
Loucount is a retail object detection and counting dataset with rich annotations collected in retail stores, consisting of 50,394 images with more than 1.9 million object instances across 140 categories.
Despite its importance for assessing the effectiveness of communicating information visually, fine-grained recallability of information visualisations has not been studied quantitatively so far.
MNIST Multiview Datasets. MNIST is a publicly available dataset consisting of 70,000 images of handwritten digits distributed over ten classes. We generated two four-view datasets where each view is a vector in R<sup>14 x 14</sup>:
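One way to obtain four 14x14 views from a 28x28 MNIST image is to split it into quadrants. Note this quadrant construction is an assumption: the description only states that each view lies in R^{14x14}, not how the views are derived.

```python
def quadrant_views(image):
    """Split a 28x28 image (list of 28 rows of 28 values) into four
    14x14 views: top-left, top-right, bottom-left, bottom-right."""
    assert len(image) == 28 and all(len(row) == 28 for row in image)
    views = []
    for r0 in (0, 14):          # top half / bottom half
        for c0 in (0, 14):      # left half / right half
            view = [row[c0:c0 + 14] for row in image[r0:r0 + 14]]
            views.append(view)
    return views
```

Each view can then be flattened into a 196-dimensional vector for multiview learning.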
Manually vAlidated Vq2a Examples fRom Image/Caption datasetS (MAVERICS) is a suite of test-only visual question answering datasets.
The Western Mediterranean Wetlands Bird Dataset is a collection of birds' vocalizations of varying lengths, primarily consisting of 5,795 labelled audio clips derived from 1,098 recordings and totalling 201.6 minutes (12,096 seconds), along with corresponding annotations. It also comes with a Mel spectrogram version of the data, where each image represents a 1-second window of the original audio, for a total of 17,536 spectrographic images. These are stored in matrix form within .npy files. These are the species covered:
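The windowing behind the spectrogram version can be sketched as below. Only the framing step is shown; turning each window into a Mel spectrogram and saving it to .npy would presumably use a library such as librosa together with numpy.save, which is an assumption about tooling, not something the description states.

```python
def one_second_windows(samples, sample_rate):
    """Split a mono recording (a sequence of samples) into
    non-overlapping 1-second windows, dropping any trailing
    partial window."""
    n = len(samples) // sample_rate
    return [samples[i * sample_rate:(i + 1) * sample_rate]
            for i in range(n)]
```

At the dataset's scale this framing is consistent: 201.6 minutes of audio is 12,096 one-second windows before clip boundaries and partial windows are accounted for.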
Sequence Consistency Evaluation (SCE) is a benchmark task for evaluating the consistency of sequences.
This dataset contains 369 images of trash intended for deep learning. Each image was manually labelled by our team for accurate detection, yielding a total of 470 bounding boxes. There are four classes: {(0: glass), (1: paper), (2: metal), (3: plastic)}.
This dataset consists of ~350k JPEG images of streetlight columns installed on a public road infrastructure located in the city of Bristol, UK.
We provide a custom synthetic bimodal dataset, called GeBiD, designed specifically for the comparison of the joint- and cross-generative capabilities of Multimodal Variational Autoencoders. It comprises RGB images of geometric primitives and textual descriptions. The dataset offers 5 levels of difficulty (based on the number of attributes) to find the minimal functioning scenario for each model. Moreover, its rigid structure enables automatic qualitative evaluation of the generated samples.
STDW is a diverse large-scale dataset for table detection with more than seven thousand samples containing a wide variety of table structures collected from many diverse sources.