3,275 machine learning datasets
F-SIOL-310 is a robotic dataset and benchmark for Few-Shot Incremental Object Learning, which is used to test incremental learning capabilities for robotic vision from a few examples.
OCD (Out-of-Context Dataset) is a synthetic dataset with fine-grained control over scene context. The images are generated using a 3D simulation engine in the VirtualHome environment, which allows control over gravity, object co-occurrences, and relative sizes across 36 object categories in a virtual household environment.
Concadia is a publicly available Wikipedia-based corpus, which consists of 96,918 images with corresponding English-language descriptions, captions, and surrounding context.
The FieldSAFE dataset is a multi-modal dataset for obstacle detection in agriculture. It comprises 2 hours of raw sensor data from a tractor-mounted sensor system in a grass mowing scenario in Denmark, October 2016.
This is a 4D light-field dataset of materials. The dataset contains 12 material categories, each with 100 images taken with a Lytro Illum, from which we extract about 30,000 patches in total.
A set of realistic odd-one-out stimuli gathered "in the wild". Each image in the Odd-One-Out (O3) dataset depicts a scene with multiple objects similar to each other in appearance (distractors) and a singleton (target) distinct in one or more feature dimensions (e.g. color, shape, size). All images are resized so that the larger dimension is 1024px. Targets represent approx. 400 common object types such as flowers, sweets, chicken eggs, leaves, tiles and birds. Pixelwise masks are provided for targets and distractors. Annotations are generated using CVAT.
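The resizing rule mentioned for O3 (larger dimension set to 1024px) can be illustrated with a small calculation. This is only a sketch: it assumes the aspect ratio is preserved and that the smaller side is rounded to the nearest pixel, neither of which is stated in the description. The function name `resized_shape` is hypothetical.

```python
def resized_shape(width, height, target=1024):
    """Return (new_width, new_height) so that the larger side equals `target`.

    Assumption: aspect ratio is preserved and the smaller side is rounded
    to the nearest integer pixel.
    """
    scale = target / max(width, height)
    return round(width * scale), round(height * scale)


# A landscape image and a portrait image both end up with a 1024px long side.
landscape = resized_shape(2048, 1536)
portrait = resized_shape(600, 800)
```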
The ukiyo-e faces dataset comprises 5209 images of faces from ukiyo-e prints. The images are 1024x1024 pixels in JPEG format and have been aligned using the procedure used for the FFHQ dataset.
DiagSet is a histopathological dataset for prostate cancer detection. The proposed dataset consists of over 2.6 million tissue patches extracted from 430 fully annotated scans, 4675 scans with assigned binary diagnosis, and 46 scans with diagnosis given independently by a group of histopathologists.
The ChineseLP dataset contains 411 vehicle images (mostly of passenger cars) with Chinese license plates (LPs). It consists of 252 images captured by the authors and 159 images downloaded from the internet. The images present great variations in resolution (from 143 × 107 to 2048 × 1536 pixels), illumination and background.
WHU-RS19 is a set of satellite images exported from Google Earth, which provides high-resolution satellite images up to 0.5 m. It contains 19 classes of meaningful scenes in high-resolution satellite imagery, including airport, beach, bridge, commercial, desert, farmland, footballfield, forest, industrial, meadow, mountain, park, parking, pond, port, railwaystation, residential, river, and viaduct. For each class, there are about 50 samples. Note that the image samples of the same class are collected from different regions in satellite images of different resolutions, and may therefore have different scales, orientations, and illuminations.
Fruits-360 dataset: a dataset of images containing fruits, vegetables, nuts and seeds. Version: 2025.03.24.0. Content: the following fruits, vegetables and nuts are included: Apples (different varieties: Crimson Snow, Golden, Golden-Red, Granny Smith, Pink Lady, Red, Red Delicious), Apricot, Avocado, Avocado ripe, Banana (Yellow, Red, Lady Finger), Beans, Beetroot Red, Blackberry, Blueberry, Cabbage, Caju seed, Cactus fruit, Cantaloupe (2 varieties), Carambula, Carrot, Cauliflower, Cherimoya, Cherry (different varieties, Rainier), Cherry Wax (Yellow, Red, Black), Chestnut, Clementine, Cocos, Corn (with husk), Cucumber (ripened, regular), Dates, Eggplant, Fig, Ginger Root, Goosberry, Granadilla, Grape (Blue, Pink, White (different varieties)), Grapefruit (Pink, White), Guava, Hazelnut, Huckleberry, Kiwi, Kaki, Kohlrabi, Kumsquats, Lemon (normal, Meyer), Lime, Lychee, Mandarine, Mango (Green, Red), Mangostan, Maracuja, Melon Piel de Sapo, Mulberry, Nectarine (Regular, Flat), Nut (Fores
The MNIST Large Scale dataset is based on the classic MNIST dataset, but contains large scale variations up to a factor of 16. The motivation behind creating this dataset was to enable testing the ability of different algorithms to learn in the presence of large scale variability and specifically the ability to generalise to new scales not present in the training set over wide scale ranges.
Topo-boundary is a benchmark dataset for offline topological road-boundary detection. The dataset contains 21,556 4-channel aerial images of size 1000 × 1000. Each image is provided with 8 training labels for different sub-tasks.
CI-MNIST (Correlated and Imbalanced MNIST) is a variant of the MNIST dataset that introduces different types of correlations between attributes, dataset features, and an artificial eligibility criterion. For an input image $x$, the label $y \in \{1, 0\}$ indicates eligibility or ineligibility, respectively, depending on whether the digit in $x$ is even or odd. The dataset defines the background color as the protected or sensitive attribute $s \in \{0, 1\}$, where blue denotes the unprivileged group and red denotes the privileged group. The dataset was designed to evaluate bias-mitigation approaches in challenging setups and to allow control over different dataset configurations.
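The labelling scheme described for CI-MNIST can be sketched in a few lines. This is an illustration under assumptions, not the dataset's actual generation code: the mapping of $s = 0$ to blue and $s = 1$ to red, the way the correlation is injected, and the names `eligibility`, `sample_background`, `annotate`, and `p_red_given_eligible` are all hypothetical.

```python
import random


def eligibility(digit):
    """Label y: 1 (eligible) for even digits, 0 (ineligible) for odd digits."""
    return 1 if digit % 2 == 0 else 0


def sample_background(y, p_red_given_eligible, rng):
    """Sensitive attribute s: assumed 1 = red (privileged), 0 = blue (unprivileged).

    Hypothetical correlation knob: eligible samples get a red background with
    probability p_red_given_eligible, ineligible ones with the complement.
    """
    p_red = p_red_given_eligible if y == 1 else 1.0 - p_red_given_eligible
    return 1 if rng.random() < p_red else 0


def annotate(digits, p_red_given_eligible=0.9, seed=0):
    """Attach (y, s) to each digit label; a seeded RNG keeps runs repeatable."""
    rng = random.Random(seed)
    rows = []
    for d in digits:
        y = eligibility(d)
        s = sample_background(y, p_red_given_eligible, rng)
        rows.append({"digit": d, "y": y, "s": s})
    return rows


rows = annotate([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
```

Setting `p_red_given_eligible` to 0.5 would make the background color independent of eligibility in this sketch, while values near 1.0 make the two strongly correlated.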
The database is written in Cyrillic and shares the same 33 characters. Besides these characters, the Kazakh alphabet also contains 9 additional specific characters. This dataset is a collection of forms. The sources of all the forms in the dataset were generated with LaTeX and subsequently filled out by people in their own handwriting. The database consists of more than 1400 filled forms. There are approximately 63000 sentences and more than 715699 symbols, produced by approximately 200 different writers. We utilized three different datasets, described as follows:
MultiSubs is a dataset of multilingual subtitles gathered from the OPUS OpenSubtitles dataset, which in turn was sourced from opensubtitles.org. We have supplemented some text fragments (visually salient nouns in this release) within the subtitles with web images, where the word sense of the fragment has been disambiguated using a cross-lingual approach. We have introduced a fill-in-the-blank task and a lexical translation task to demonstrate the utility of the dataset. Please refer to our paper for a more detailed description of the dataset and tasks. MultiSubs will benefit research on visual grounding of words, especially in the context of free-form sentences.
With complex scenes and rich annotations, the PADv2 dataset can be used as a test bed to benchmark affordance detection methods and may also facilitate downstream vision tasks, such as scene understanding, action recognition, and robot manipulation.
AutoChart is a dataset for chart-to-text generation, a task that consists of generating analytical descriptions of visual plots.
The Photometrically Distorted Synthetic COCO (PDS-COCO) dataset is a synthetically created dataset for learning homography estimation. The idea is exactly the same as in the Synthetic COCO (S-COCO) dataset, with SSD-like image distortion added at the beginning of the whole procedure. The first step adjusts the brightness of the image using a randomly drawn value $\delta_b \sim \mathcal{U}(-32, 32)$. Next, contrast, saturation and hue noise are applied with the following values: $\delta_c \sim \mathcal{U}(0.5, 1.5)$, $\delta_s \sim \mathcal{U}(0.5, 1.5)$ and $\delta_h \sim \mathcal{U}(-18, 18)$. Finally, the color channels of the image are randomly swapped with a probability of $0.5$. This photometric distortion procedure is applied to the original image twice, independently, to create the source and target candidates.
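The distortion steps listed for PDS-COCO can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: it assumes 8-bit RGB pixel values, clipping to [0, 255] after each step, a hue shift expressed in degrees on a 360-degree wheel, and a uniformly random channel permutation when a swap occurs. All function names are hypothetical.

```python
import colorsys
import random


def clip(value, lo=0.0, hi=255.0):
    return max(lo, min(hi, value))


def sample_params(rng):
    """Draw one set of distortion parameters from the ranges quoted above."""
    swap = rng.random() < 0.5  # channels are swapped with probability 0.5
    return {
        "delta_b": rng.uniform(-32, 32),   # additive brightness shift
        "delta_c": rng.uniform(0.5, 1.5),  # multiplicative contrast factor
        "delta_s": rng.uniform(0.5, 1.5),  # multiplicative saturation factor
        "delta_h": rng.uniform(-18, 18),   # hue shift (assumed to be degrees)
        # Assumed: a swap draws a random permutation (which may be the identity).
        "perm": tuple(rng.sample(range(3), 3)) if swap else (0, 1, 2),
    }


def distort_pixel(rgb, p):
    # Step 1: brightness.
    r, g, b = (clip(c + p["delta_b"]) for c in rgb)
    # Step 2: contrast.
    r, g, b = (clip(c * p["delta_c"]) for c in (r, g, b))
    # Step 3: saturation and hue, applied in HSV space.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    s = clip(s * p["delta_s"], 0.0, 1.0)
    h = (h + p["delta_h"] / 360.0) % 1.0
    out = tuple(c * 255.0 for c in colorsys.hsv_to_rgb(h, s, v))
    # Step 4: channel reordering.
    return tuple(out[i] for i in p["perm"])


def distort_image(image, rng):
    """Apply one parameter draw to every pixel of a nested-list RGB image."""
    p = sample_params(rng)
    return [[distort_pixel(px, p) for px in row] for row in image]


def make_candidates(image, seed=None):
    """Distort the same original twice, independently: (source, target)."""
    rng = random.Random(seed)
    return distort_image(image, rng), distort_image(image, rng)
```

For example, `make_candidates([[(10, 200, 30), (255, 0, 128)]], seed=0)` returns two differently distorted copies of a 1×2 image. A practical pipeline would more likely use an array library for speed; nested lists are used here only to keep the sketch dependency-free.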
AwA Pose is a large-scale animal keypoint dataset with ground truth annotations for keypoint detection of quadruped animals from images.