Image corruptions modelling primary optical aberrations.
The UFPR-VCR dataset contains 10,039 images of 9,502 distinct vehicles across various categories, including cars, vans, buses, and trucks. The images capture a broad spectrum of real-world conditions, such as frontal and rear views, partial occlusions, diverse lighting situations, and nighttime scenes. The dataset was designed to address Vehicle Color Recognition (VCR) in more challenging scenarios than those explored in previous studies.
A dataset of abdominal CT studies in NIfTI format from the open-source medical data repository Medical Decathlon was utilized. To expedite the segmentation process, the MONAILabel plugin of the MONAI framework within the 3D Slicer program was employed. A radiologist with 15 years of experience conducted a validation process, verifying the boundaries of the colon annotation on each slice. The existing colorectal cancer markings in the dataset remained unaltered. Validation by the radiologist reduced the size of the validated dataset to 122 studies. These 122 studies were categorized into three subsets based on data quality: the "good" subset comprises 100 studies; the "cropped" subset contains 17 studies in which the entire colon is not visible in the image; and the "bad" subset comprises five studies. Two of these five were of such poor quality that the entire colon could not be identified, and two further studies involved colon stomas following surgery.
Tiny ImageNet-A is a subset of the Tiny ImageNet test set consisting of 3,374 real-world, unmodified, and naturally occurring examples that are misclassified by ResNet-18. The sampling process roughly follows the concept of ImageNet-A introduced by Hendrycks et al. ("Natural Adversarial Examples"). For further information on the sampling process, see the original paper.
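The selection step described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: predictions and labels are assumed to be precomputed integer class arrays from running the fixed classifier (ResNet-18 in the paper) over the test split.

```python
import numpy as np

def misclassified_indices(preds, labels):
    """Return indices where the model's prediction disagrees with the label."""
    preds = np.asarray(preds)
    labels = np.asarray(labels)
    # Naturally adversarial examples are simply the images the model gets wrong.
    return np.flatnonzero(preds != labels)
```

Keeping the test images at these indices yields the naturally adversarial subset.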
EuroSAT-C is an open-source dataset comprising algorithmically generated corruptions applied to the EuroSAT test set, following the concept of ImageNet-C. It comprises 19 corruption types (15 test corruptions and 4 validation corruptions) at 5 severity levels, resulting in 108,000 images for the validation set and 405,000 images for the test set. For further information on the corruptions, see the original ImageNet-C GitHub repository.
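A minimal sketch of one such severity-indexed corruption, in the spirit of ImageNet-C's Gaussian noise. The sigma values below are illustrative assumptions, not the official ImageNet-C constants:

```python
import numpy as np

# One sigma per severity level 1-5 (illustrative values).
SIGMAS = [0.04, 0.06, 0.08, 0.09, 0.10]

def gaussian_noise(image, severity=1, rng=None):
    """Corrupt a float image in [0, 1] with Gaussian noise at severity 1-5."""
    rng = rng or np.random.default_rng(0)
    sigma = SIGMAS[severity - 1]
    noisy = image + rng.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0.0, 1.0)  # keep pixel values in the valid range
```

Applying each of the 19 corruptions at each of the 5 severities to every test image is what produces the multiplied image counts quoted above.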
The National Institute of Informatics - Chiba University (NII-CU) Multispectral Aerial Person Detection Dataset consists of 5,880 pairs of aligned RGB+FIR (Far infrared) images captured from a drone flying at heights between 20 and 50 meters, with the cameras pointed at 45 degrees down. We applied lens distortion correction and a homography warping to align the thermal images with the RGB images. We then labeled the people visible on the images with rectangular bounding boxes. The footage shows a baseball field and surroundings in Chiba, Japan, recorded in January 2020.
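The homography-based alignment step can be sketched as below. This assumes a 3x3 homography matrix H has already been estimated from matched calibration points between the thermal and RGB cameras (the estimation itself is not shown):

```python
import numpy as np

def warp_points(H, pts):
    """Map Nx2 pixel coordinates through a 3x3 homography H."""
    pts = np.asarray(pts, dtype=float)
    # Lift to homogeneous coordinates, apply H, divide out the scale factor.
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

Warping every thermal pixel location this way (after lens-distortion correction) brings the FIR image into the RGB image's coordinate frame so bounding boxes apply to both modalities.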
The Swiss Drone data set was recorded around Cheseaux-sur-Lausanne in Switzerland using a senseFly eBee Classic in 2014 (SenseFly, 2020). The 100 images were captured from a top-down perspective at a flight height of approximately 80 m above the ground at a resolution of 4608 x 3456 pixels. The Okutama Drone data set was recorded and annotated by NII (Laurmaa, 2016) in 2016 using a DJI Phantom 4 at a resolution of 3840 x 2160 pixels. The 91 images were captured over Okutama, west of Tokyo, Japan, from a drone at a flight height of approximately 90 m above the ground. Here, the flight height may have varied more as Okutama is located in a narrow valley with uneven ground.
They make available a dataset of 655 places, collected by non-expert users worldwide.
A scene change detection (SCD) dataset tailored for generalizable SCD algorithms. It consists of change-labeled images from SF-XL, St Lucia, and Nordland, which are widely used in VPR research.
The ViCoS Towel Dataset is a state-of-the-art benchmark for grasp point localization on cloth objects, specifically towels. Designed to advance research in robotic grasping and perception for textile objects, this dataset includes a collection of 8,000 high-resolution RGB-D images (1920×1080) captured with a Kinect V2 under a variety of conditions. Each image provides detailed depth information, making it ideal for training deep learning models and conducting thorough benchmarking.
The JAMBO dataset contains 3,290 underwater images of the seabed captured by an ROV in temperate waters in the Jammer Bay area off the northwest coast of Jutland, Denmark. Each image has been annotated by six annotators with one of three classes: sand, stone, or bad.
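With six annotators per image, a single training label can be derived by majority vote. A minimal sketch, assuming the simple tie-breaking rule below (the dataset's own aggregation procedure, if any, is not specified here):

```python
from collections import Counter

CLASSES = ("sand", "stone", "bad")

def majority_label(votes):
    """Return the most frequent class among annotator votes.

    Ties are broken by the order of CLASSES, an illustrative assumption.
    """
    counts = Counter(votes)
    return max(CLASSES, key=lambda c: counts.get(c, 0))
```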
The RAD (Randomly Assembled Object Construction) dataset is a synthetic 3D LEGO dataset designed for the task of Sequential Brick Assembly (SBA).
A large-scale traffic sign and traffic light dataset with accurate 3D positioning and temporally consistent 3D bounding boxes of traffic management objects from up to 200 meters away. The dataset contains additional attributes such as traffic light state, traffic light mask type, traffic sign type, and occlusion. The target application is 3D traffic light and sign detection for autonomous driving.
Large multimodal models (LMMs) excel at following human instructions. However, self-contradictory instructions may arise as multimodal interaction and context lengths increase, which is especially challenging for language beginners and vulnerable populations. We introduce the Self-Contradictory Instructions benchmark to evaluate the capability of LMMs to recognize conflicting commands. It comprises 20,000 conflicts, evenly distributed between language and vision paradigms, and is constructed with a novel automatic dataset-creation framework that expedites the process and enables us to cover a wide range of instruction forms. Our comprehensive evaluation reveals that current LMMs consistently struggle to identify multimodal instruction discordance due to a lack of self-awareness. We therefore propose Cognitive Awakening Prompting, which injects cognition from the outside and substantially enhances dissonance detection.
Revealing latent structure in data is an active field of research, having introduced exciting technologies such as variational autoencoders and adversarial networks, and is essential to push machine learning towards unsupervised knowledge discovery. However, a major challenge is the lack of suitable benchmarks for an objective and quantitative evaluation of learned representations. To address this issue we introduce Morpho-MNIST, a framework that aims to answer: "to what extent has my model learned to represent specific factors of variation in the data?" We extend the popular MNIST dataset by adding a morphometric analysis enabling quantitative comparison of trained models, identification of the roles of latent variables, and characterisation of sample diversity. We further propose a set of quantifiable perturbations to assess the performance of unsupervised and supervised methods on challenging tasks such as outlier detection and domain adaptation. Data and code are available at https
Dataset to benchmark continual learning for object detection in a Tiny Robotics setting.
We introduce HSIRS, a large-scale dataset of hyperspectral images with corresponding manually annotated segmentation maps for material characterization and classification based on spectral signatures. Such data can be used to simulate any type of spectrometer and to train DNNs end-to-end for spectral reconstruction and image segmentation tasks. HSIRS features scenes containing real and fake (polyester, plastic, or ceramic) food items with different backgrounds and scene layouts; some scenes also contain color checkers. Spectral bands are captured sequentially using a VariSpec™ tunable color filter, and each scene is illuminated with four halogen light sources.
The GTA-UAV dataset provides a large continuous-area dataset (covering 81.3 km²) for UAV visual geo-localization, extending the previously fixed, aligned drone-satellite pairs to arbitrary drone-satellite pairs to better match real-world application scenarios.