WeatherKITTI is currently the most realistic all-weather simulated enhancement of the KITTI dataset. It simulates the three weather conditions that most affect visual perception in real-world scenarios: rain, snow, and fog. Each weather type has two intensity levels, severe and extremely severe, which together with clear weather yield a weather-enhanced dataset with three intensity levels and seven weather scenarios.
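The scenario count above can be enumerated directly; this is only a sanity check of the arithmetic, with the scenario labels chosen here for illustration rather than taken from the dataset.

```python
# Three weather types, each at two simulated intensity levels, plus clear
# weather, give WeatherKITTI's seven scenarios.
weathers = ["rain", "snow", "fog"]
intensities = ["severe", "extremely severe"]

scenarios = ["clear"] + [f"{level} {w}" for w in weathers for level in intensities]
print(len(scenarios))  # 7
```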
Lifespan HCP Release 2.0 includes cross-sectional visit 1 (V1) preprocessed structural and functional imaging data, unprocessed V1 imaging data for all included modalities (structural, high-res hippocampal T2, resting state fMRI, task fMRI, diffusion, and ASL), and non-imaging demographic and behavioral assessment data from 725 HCP-Aging (HCP-A, ages 36-100+) healthy participants (22+ TB of data).
UK Biobank participants have generously provided a very wide range of information about their health and well-being since recruitment began in 2006. This information has been supplemented in the following ways:
The DAPlankton dataset consists of over 110k expert-labeled plankton images. The data is divided into two subsets: DAPlankton_LAB and DAPlankton_SEA. DAPlankton_LAB consists of images captured from multiple mono-specific phytoplankton cultures, which were analysed using three different imaging instruments: the Imaging FlowCytoBot (IFCB), the CytoSense (CS) flow cytometer, and the FlowCam (FC) imaging microscope, each producing cropped images containing a single plankton particle. An expert further verified the class of each image, ensuring that there was no cross-contamination between different cultures. This process resulted in a balanced dataset with negligible label uncertainty. DAPlankton_SEA consists of images captured from water samples collected from the Baltic Sea using two different imaging instruments: IFCB and CS. Each image was manually labeled by an expert. DAPlankton_SEA provides a realistic and more challenging dataset with a large class imbalance and natural intra-class variance.
C2A (Combination to Application): this repository contains the code and information for the paper "UAV-Enhanced Combination to Application: Comprehensive Analysis and Benchmarking of a Human Detection Dataset for Disaster Scenarios" by Ragib Amin Nihal, Benjamin Yen, Katsutoshi Itoyama, and Kazuhiro Nakadai.
COMPASS-XP is a dataset of matched photographic and X-ray images of single objects, made available for use in Machine Learning & Computer Vision research, in particular in the context of transport security. Objects are imaged in multiple poses and accompanied by metadata, including labels indicating whether we consider the object dangerous in the context of aviation. Object classes overlap with those in the popular ImageNet Large Scale Visual Recognition Challenge class set and the WordNet lexical database, and identifiers for shared classes in both schemes are also provided.
A significant challenge in removing shadows from indoor scenes is obtaining shadow-free images. To overcome this challenge, we propose a novel rendering pipeline for generating shadowed and shadow-free images under direct and indirect illumination, and create a comprehensive synthetic dataset that contains over 30,000 image pairs, covering various object types and lighting conditions.
A high-quality synthetic dataset for object relighting, covering a wide range of geometries and materials.
A high-quality captured dataset for object relighting, covering a wide range of geometries and materials.
We introduce the Chinese Image Implication Understanding Benchmark (CII-Bench), a new benchmark measuring the higher-order perceptual, reasoning, and comprehension abilities of MLLMs when presented with complex Chinese implication images. These images, including abstract artworks, comics, and posters, possess visual implications that require an understanding of visual details and reasoning ability. CII-Bench reveals whether current MLLMs, leveraging their inherent comprehension abilities, can accurately decode the metaphors embedded within the complex and abstract information presented in these images.
The COFAR (COmmonsense and FActual Reasoning) dataset is a collection of images and text queries specifically designed to challenge and evaluate image search models that aim to go beyond simple visual matching. It focuses on the ability of these models to perform commonsense and factual reasoning, a capability currently lacking in most existing image search technology.
The synthetic ShapeNet intrinsic image decomposition dataset used for training the deep CNN models IntrinsicNet and RetiNet (CVPR 2018). See Section 4.1 of the paper for details.
I2-2000FPS is the first high-speed video dataset offering an unprecedented temporal resolution of 2000 frames per second (fps). Captured using the commercially available Chronos 1.4 high-speed CMOS camera, the dataset includes a diverse range of objects varying in size, shape, orientation, and motion, as well as various camera movements. This dataset is designed to enable research in areas such as motion analysis, object tracking, and scene understanding at extreme temporal resolutions. Potential applications span fields like sports analysis, robotics, autonomous navigation, and high-speed videography.
The researchers of Qatar University have compiled the COVID-QU-Ex dataset, which consists of 33,920 chest X-ray (CXR) images:
* 11,956 COVID-19
* 11,263 Non-COVID infections (Viral or Bacterial Pneumonia)
* 10,701 Normal

Ground-truth lung segmentation masks are provided for the entire dataset, making it the largest lung mask dataset created to date.
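A quick check that the per-class counts reported above sum to the stated total of 33,920 images:

```python
# Per-class image counts reported for COVID-QU-Ex
counts = {"COVID-19": 11_956, "Non-COVID": 11_263, "Normal": 10_701}
total = sum(counts.values())
print(total)  # 33920, matching the stated dataset size
```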
SynMirror consists of samples rendered from 3D assets of two widely used 3D object datasets, Objaverse and Amazon Berkeley Objects (ABO), placed in front of a mirror in a virtual Blender environment. The total number of rendered samples is 198,204. Each rendering contains colors, category_id_segmaps, depth, normals, and cam_states.
The LeukemiaAttri dataset is a large-scale, multi-domain collection of microscopy images derived from leukemia patient samples, enriched with detailed morphological information. This dataset comprises a total of 28.9K images (2.4K images × 2 microscopes × 3 resolutions × 2 cameras), captured using both low-cost and high-cost microscopes at three different resolutions (10x, 40x, and 100x) with various cameras. In addition to providing location annotations for each white blood cell (WBC), the dataset includes comprehensive morphological attributes for every WBC, enhancing its utility for research and analysis in the field.
BASEPROD provides comprehensive rover sensor data collected over a 1.7 km traverse, accompanied by high-resolution 2D and 3D drone maps of the terrain. The dataset also includes laser-induced breakdown spectroscopy (LIBS) measurements from key sampling sites along the rover's path, as well as weather station data to contextualize environmental conditions.
DenseUAV is a dataset of drone and satellite views collected from 14 universities in low-altitude urban scenes. Its main features are real-scene sampling, a sampling perspective perpendicular to the ground, and dense sampling. In total, the dataset covers 3,033 sampling points, comprising 9,099 drone-view images and 18,198 satellite-view images.
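The image totals imply a fixed number of views per sampling point, which can be verified directly from the stated counts:

```python
# Counts stated for DenseUAV
points = 3033
drone_images = 9099
satellite_images = 18198

# Views per sampling point implied by the totals
print(drone_images // points, satellite_images // points)  # 3 drone, 6 satellite
```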
We introduce Recognition-based Object Probing Evaluation (ROPE), an automated evaluation protocol that considers the distribution of object classes within a single image during testing and uses visual referring prompts to eliminate ambiguity. ROPE covers several instruction settings. In a single turn of prompting without format enforcement, we probe the model to recognize the five objects referred to by the visual prompts (a) one at a time in the single-object setting, and (b) concurrently in the multi-object setting. We further enforce a format template and decode only the object tokens for each of the five objects, (c) without output manipulation in student forcing, and (d) replacing all previously generated object tokens with the ground-truth classes in teacher forcing.
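The student- vs. teacher-forcing distinction above can be sketched as a decoding loop. This is an illustrative sketch, not the benchmark's implementation: `model_step` is a hypothetical stand-in for a real MLLM call, and the list-based context is an assumed format.

```python
def decode_objects(model_step, ground_truth, teacher_forcing):
    """Decode one object name per step for the five probed objects.

    model_step(context) -> predicted class name (stub for an MLLM call).
    Under teacher forcing, each previously generated object token in the
    context is replaced by its ground-truth class before the next step;
    under student forcing, the model's own outputs remain in the context.
    """
    context, predictions = [], []
    for truth in ground_truth:
        pred = model_step(context)
        predictions.append(pred)
        context.append(truth if teacher_forcing else pred)
    return predictions

# Toy model that just echoes the last context token ("cat" when context is empty)
toy = lambda ctx: ctx[-1] if ctx else "cat"
print(decode_objects(toy, ["dog", "car", "tree", "cup", "bird"], teacher_forcing=True))
```

With this toy model, teacher forcing lets each step see the previous ground-truth class, while student forcing lets the toy model's initial error propagate through every subsequent step.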
The RePAIR dataset consists of over 1,000 fractured frescoes. It stands as a realistic computational challenge for 2D and 3D puzzle-solving methods and serves as a benchmark that enables the study of fractured-object reassembly while presenting new challenges for geometric shape understanding. Please visit our website for more dataset information, access to source code scripts, and an interactive gallery of dataset samples.