3,275 machine learning datasets
mini-ImageNet was proposed in "Matching Networks for One Shot Learning" for few-shot learning evaluation, in an attempt to have a dataset like ImageNet while requiring fewer resources. Following the statistics of CIFAR-100-LT with an imbalance factor of 100, we construct a long-tailed variant of mini-ImageNet that features all 100 classes and an imbalanced training set with $N_1 = 500$ images in the largest class and $N_K = 5$ images in the smallest. For evaluation, both the validation and test sets are balanced and contain 10K images each, 100 samples for each of the 100 categories.
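As a rough illustration of how such a long-tailed split can be sized, the sketch below computes per-class training counts. It assumes the exponential decay profile commonly used for CIFAR-LT variants; the description above only fixes the two endpoints (500 and 5), so the intermediate counts and the function name `long_tailed_counts` are assumptions, not the authors' exact procedure.

```python
def long_tailed_counts(n_max=500, n_min=5, num_classes=100):
    """Per-class training-set sizes under an assumed exponential decay profile."""
    # n_min / n_max is the inverse of the imbalance factor (1/100 here).
    ratio = n_min / n_max
    return [
        round(n_max * ratio ** (k / (num_classes - 1)))
        for k in range(num_classes)
    ]


counts = long_tailed_counts()
print(counts[0], counts[-1])  # largest and smallest class sizes
```

The ratio between the first and last entries reproduces the imbalance factor of 100 stated in the description.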
CAR contains visual attributes for objects in the Cityscapes dataset. For each object in an image, we have a list of attributes that depend on the category of the object. For instance, a vehicle category has a visibility attribute while a pedestrian has an activity attribute (walking, standing, etc.).
CytoImageNet is a large-scale pretraining dataset of microscopy images (890K images, 894 classes). In the paper, CytoImageNet pretraining yielded features that are competitive with, and different from, ImageNet-pretrained features on downstream microscopy tasks.
SuperCaustics is a simulation tool made in Unreal Engine for generating massive computer vision datasets that include transparent objects.
The dataset contains patches of facial reflectance as described in the paper, namely the diffuse albedo, diffuse normals, specular albedo, specular normals, as well as the shape in UV space. For the shape, reconstructed meshes have been registered to a common topology and the XYZ values of the points have been mapped to the RGB in UV coordinates and interpolated to complete the UV map. From the complete UV maps of 6144x4096 pixels, patches of 512x512 pixels have been sampled. The dataset contains 7500 such patches (1500 of each datatype) that are anonymized, randomized and sampled so that they do not contain identifiable features.
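The patch sampling described above can be sketched as follows. This is a minimal illustration, assuming uniform random crops and assuming that 6144x4096 is given as width x height; the actual sampling and anonymization procedure is not specified here, and `sample_patch_boxes` is a hypothetical helper.

```python
import random

# The five datatypes listed in the description.
DATATYPES = [
    "diffuse_albedo",
    "diffuse_normals",
    "specular_albedo",
    "specular_normals",
    "shape",
]
PATCHES_PER_DATATYPE = 1500

UV_WIDTH, UV_HEIGHT = 6144, 4096  # assumed (width, height) order
PATCH = 512


def sample_patch_boxes(n, seed=0):
    """Return n random (left, top, right, bottom) crop boxes inside the UV map."""
    rng = random.Random(seed)
    boxes = []
    for _ in range(n):
        left = rng.randint(0, UV_WIDTH - PATCH)
        top = rng.randint(0, UV_HEIGHT - PATCH)
        boxes.append((left, top, left + PATCH, top + PATCH))
    return boxes


total_patches = len(DATATYPES) * PATCHES_PER_DATATYPE
boxes = sample_patch_boxes(4)
```

Each box can be passed to an image library's crop routine; the total of 5 x 1500 matches the 7500 patches stated above.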
This work was undertaken by members of the Lincoln Centre for Autonomous Systems, University of Lincoln, UK. The four data collection sessions were conducted at three different sites in Lincolnshire, UK and one in Murcia, Spain (see Fig. 1). The sessions were conducted at the beginning and towards the end of the harvesting season in the UK and at the end of the harvest in Spain. The variety of broccoli plants grown in the UK is called Iron Man, whilst the variety grown in Spain is called Titanium. The weather during the UK data capture was a mixture of sunny, overcast and rainy conditions, with broccoli varying in maturity from small to larger to already harvested, while the conditions for data capture in Spain included strong sunlight and mature plants at the very end of the harvesting season. The tractor was driven through the broccoli field at a slow walking speed, with two rows of broccoli plants being imaged by the RGB-D sensor.
CPPE-5 (Medical Personal Protective Equipment) is a new, challenging dataset that aims to enable the study of subordinate categorization of medical personal protective equipment, which is not possible with other popular datasets that focus on broad-level categories.
In EMDS-6, there are 21 classes of environmental microorganisms (EMs). In each class, there are 40 original EM images and their corresponding binary ground truth images. In the ground truth images, the foreground is white and the background is black.
The Corn Disease and Severity (CD&S) dataset consists of 511, 524, and 562 field-acquired raw images, corresponding to three common foliar corn diseases, namely Northern Leaf Blight (NLB), Gray Leaf Spot (GLS), and Northern Leaf Spot.
The size of the data set is about 1 GB. The data set consists of 900 image sequences of 9 gesture classes, which are defined by 3 primitive hand shapes and 3 primitive motions. The target task for this data set is therefore to classify different shapes and different motions at the same time.
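Because the 9 classes are the product of 3 shapes and 3 motions, a joint label can be encoded and decoded as a (shape, motion) pair. The sketch below is illustrative only: the shape and motion names are placeholders, and the index ordering is an assumption rather than the data set's official labeling.

```python
from itertools import product

# Placeholder names; the description does not list the actual shapes or motions.
SHAPES = ["shape_0", "shape_1", "shape_2"]
MOTIONS = ["motion_0", "motion_1", "motion_2"]

# Every gesture class is one (shape, motion) combination.
GESTURE_CLASSES = list(product(SHAPES, MOTIONS))


def encode(shape_idx, motion_idx):
    """Map a (shape, motion) index pair to a single class label in [0, 8]."""
    return shape_idx * len(MOTIONS) + motion_idx


def decode(label):
    """Recover the (shape, motion) index pair from a class label."""
    return divmod(label, len(MOTIONS))
```

A classifier can then predict either the joint label directly or the two factors separately and combine them with `encode`.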
This dataset contains 54,987 UI screenshots and the metadata from 7,748 Android applications belonging to 25 application categories.
This dataset comprises 1344 expert-annotated images of muscle-tendon junctions recorded with 3 ultrasound imaging systems (Aixplorer V6, Esaote MyLab60, Telemed ArtUs), on 2 muscles (Lateral Gastrocnemius, Medial Gastrocnemius), and 2 movements (isometric maximum voluntary contractions, passive torque movements).
To provide ground truth supervision for video consistency modeling, we build a high-quality dynamic OLAT dataset. Our capture system consists of a light stage setup with 114 LED light sources and a Phantom Flex4K-GS camera (a stationary, global-shutter 4K ultra-high-speed camera running at 1000 fps), resulting in a dynamic OLAT image set recorded at 25 fps using the overlapping method. Our dynamic OLAT dataset provides sufficient semantic, temporal and lighting consistency supervision to train our neural video portrait relighting scheme, which can generalize to in-the-wild scenarios.
The Iconary dataset is for testing multimodal communication with drawings and text.
This is a set of 100,000 non-overlapping image patches from hematoxylin & eosin (H&E) stained histological images of human colorectal cancer (CRC) and normal tissue. All images are 224x224 pixels (px) at 0.5 microns per pixel (MPP). The tissue classes are: adipose (ADI), background (BACK), debris (DEB), lymphocytes (LYM), mucus (MUC), smooth muscle (MUS), normal colon mucosa (NORM), cancer-associated stroma (STR), and colorectal adenocarcinoma epithelium (TUM). The images were manually extracted from N=86 H&E stained human cancer tissue slides from formalin-fixed paraffin-embedded (FFPE) samples from the NCT Biobank (National Center for Tumor Diseases, Heidelberg, Germany) and the UMM pathology archive (University Medical Center Mannheim, Mannheim, Germany). Tissue samples contained CRC primary tumor slides and tumor tissue from CRC liver metastases; normal tissue classes were augmented with non-tumorous regions from gastrectomy specimens to increase variability.
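For orientation, the nine class abbreviations and the physical patch size implied by the resolution can be written down as a small lookup. This is a sketch only: the alphabetical label-to-index ordering is an assumption and may not match the ordering used in any published split.

```python
# Class abbreviations and names as listed in the description.
TISSUE_CLASSES = {
    "ADI": "adipose",
    "BACK": "background",
    "DEB": "debris",
    "LYM": "lymphocytes",
    "MUC": "mucus",
    "MUS": "smooth muscle",
    "NORM": "normal colon mucosa",
    "STR": "cancer-associated stroma",
    "TUM": "colorectal adenocarcinoma epithelium",
}

PATCH_SIDE_PX = 224
MICRONS_PER_PIXEL = 0.5

# Assumed ordering: alphabetical by abbreviation.
LABEL_TO_INDEX = {abbr: i for i, abbr in enumerate(sorted(TISSUE_CLASSES))}


def patch_side_microns(side_px=PATCH_SIDE_PX, mpp=MICRONS_PER_PIXEL):
    """Physical side length of one patch in microns."""
    return side_px * mpp
```

At 0.5 MPP, a 224 px patch covers 112 microns per side.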
I.PHI is a machine-actionable version of the Packard Humanities Institute (PHI) database of ancient Greek inscriptions, including its geographical and chronological metadata. The processed dataset is referred to as I.PHI.
The appearance of the world varies dramatically not only from place to place but also from hour to hour and month to month. Every day billions of images capture this complex relationship, many of which are associated with precise time and location metadata. We propose to use these images to construct a global-scale, dynamic map of visual appearance attributes. Such a map enables fine-grained understanding of the expected appearance at any geographic location and time. Our approach integrates dense overhead imagery with location and time metadata into a general framework capable of mapping a wide variety of visual attributes. A key feature of our approach is that it requires no manual data annotation. We demonstrate how this approach can support various applications, including image-driven mapping, image geolocalization, and metadata verification.
Object detection data set created with the DeepGTAV engine, which is based on the video game GTAV. It is one of the three data sets proposed in the paper. This data set is motivated by the VisDrone data set and has almost the same classes.
Object detection data set created with the DeepGTAV engine, which is based on the video game GTAV. It is one of the three data sets proposed in the paper. This data set is motivated by the SeaDronesSee dataset and has almost the same classes.
We present SILVR, a dataset of light field images for six-degrees-of-freedom navigation in large fully-immersive volumes. SILVR stands for "Synthetic Immersive Large-Volume Ray".