3,275 machine learning datasets
Unpaired dataset: The dataset was built by the authors and consists entirely of real hazy images collected from websites.
The dataset contains procedurally generated images of transparent vessels containing liquid and objects. The data for each image includes segmentation maps, 3D depth maps, and normal maps of the liquid or object inside the transparent vessel, and of the vessel itself. In addition, the properties of the materials inside the containers are given (color, transparency, roughness, metalness). A natural image benchmark for the 3D/depth estimation of objects inside transparent containers is also supplied, as are 3D models of the objects (glTF).
Raw-Microscopy:
RealHDRTV is the first real-world paired SDRTV-HDRTV dataset. It includes SDRTV-HDRTV pairs at 8K resolution captured by a smartphone camera in the “SDR” and “HDR10” modes. To avoid possible misalignment, a professional steady tripod is used and only indoor or controlled static scenes are captured. After acquisition, regions with obvious motion (10+ pixels) or lighting changes are cut out, the remaining content is cropped into 4K image pairs, and a global 2D translation is used to align each cropped pair. Pairs that still show obvious misalignment are then removed, yielding final 4K SDRTV-HDRTV pairs with misalignment of no more than 1 pixel as the labeled inference dataset.
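The global 2D translation step can be illustrated with a minimal sketch. The description does not say how the translation is estimated, so phase correlation is used here purely as an assumed stand-in; the function names `estimate_translation` and `align` are hypothetical, and the sketch works on single-channel arrays with integer-pixel, circular shifts only.

```python
import numpy as np


def estimate_translation(ref, moving):
    """Estimate the integer shift (dy, dx) that maps `moving` onto `ref`.

    Assumption: phase correlation is an illustrative choice, not the
    method documented for the dataset.
    """
    f_ref = np.fft.fft2(ref)
    f_mov = np.fft.fft2(moving)
    cross = f_ref * np.conj(f_mov)
    # Normalise to unit magnitude so only phase (i.e. shift) information remains.
    cross /= np.maximum(np.abs(cross), 1e-12)
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(int(p) if p <= s // 2 else int(p) - s
                 for p, s in zip(peak, corr.shape))


def align(ref, moving):
    """Apply the estimated global translation to `moving` (circular shift)."""
    dy, dx = estimate_translation(ref, moving)
    return np.roll(moving, shift=(dy, dx), axis=(0, 1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.random((64, 64))                          # stand-in for an SDR crop
    moving = np.roll(ref, shift=(3, -5), axis=(0, 1))   # synthetically misaligned copy
    print(estimate_translation(ref, moving))
```

In a real pipeline the residual offset after alignment would then be compared against the 1-pixel threshold to decide whether a pair is kept; real image pairs would also need border handling instead of a circular shift.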
This data set contains over 600GB of multimodal data from a Mars analog mission, including accurate 6DoF outdoor ground truth, indoor-outdoor transitions with continuous cross-domain ground truth, and indoor data with Optitrack measurements as ground truth. With 26 flights and a combined distance of 2.5km, this data set provides you with various distinct challenges for testing and proofing your algorithms. The UAV carries 18 sensors, including a high-resolution navigation camera and a stereo camera with an overlapping field of view, two RTK GNSS sensors with centimeter accuracy, as well as three IMUs, placed at strategic locations: Hardware dampened at the center, off-center with a lever arm, and a 1kHz IMU rigidly attached to the UAV (in case you want to work with unfiltered data). The sensors are fully pre-calibrated, and the data set is ready to use. However, if you want to use your own calibration algorithms, then the raw calibration data is also ready for download. The cross-domai
Halpe-FullBody is a full-body keypoint dataset in which each person is annotated with 136 keypoints: 20 for the body, 6 for the feet, 42 for the hands, and 68 for the face. It is designed for the task of whole-body human pose estimation.
The Visual Commonsense Immorality benchmark is designed to evaluate commonsense immorality detection. It contains 2,172 immoral images for general and extensive immoral image detection.
AtyPict is a dataset of atypical sketch content designed for atypical sketch content detection tasks.
A unique radiogenomic dataset from a Non-Small Cell Lung Cancer (NSCLC) cohort of 211 subjects. The dataset comprises Computed Tomography (CT) and Positron Emission Tomography (PET)/CT images, semantic annotations of the tumors as observed on the medical images using a controlled vocabulary, segmentation maps of tumors in the CT scans, and quantitative values obtained from the PET/CT scans. Imaging data are also paired with gene mutation data, RNA sequencing data from samples of surgically excised tumor tissue, and clinical data, including survival outcomes.
The Ambiguous VQA dataset is a dataset of ambiguous questions about images. It consists of a set of ambiguous images and their answers. It is used to train and evaluate question generation models in English.
The Marine Microalgae Detection in Microscopy Images dataset contains 937 images in total, and all objects in these images are annotated. The total number of annotated objects is 4,201. The training set contains 537 images and the testing set contains 430 images.
This dataset, designed for post-OCR text correction, contains around 218K sentences (1.5 million words) from 30 different books.
MM-Locate-News is a dataset for location estimation of news. It consists of 6,395 news articles covering 237 cities and 152 countries across all continents, as well as multiple domains such as health, environment, and politics. The dataset is collected in a weakly supervised manner, and multiple data cleaning steps are applied to remove articles with potentially inaccurate geolocation information. The dataset addresses drawbacks of other datasets such as BreakingNews, as it considers the multimodal content of news to label the corresponding location.
A stack of 2D grayscale images from a 3D X-ray Computed Tomography (XCT) scan of a glass fiber-reinforced polyamide 66 (GF-PA66) specimen.
Chinese Character Stroke Extraction (CCSE) is a benchmark containing two large-scale datasets: Kaiti CCSE (CCSE-Kai) and Handwritten CCSE (CCSE-HW). It is designed for stroke extraction problems.
KITTI-6DoF is a dataset that contains annotations for the 6DoF estimation task for 5 object categories on 7,481 frames.
The McQueen dataset contains 15k visual conversations and over 80k queries, each associated with a fully specified rewrite. In addition, for entities appearing in the rewrite, the corresponding image box annotation is provided.
VD-Ref is a dataset with ground-truth mappings from both noun phrases and pronouns to image regions. It contains 10k complete sets from the VisDialog dataset, and uses the StanfordCoreNLP tool to tokenize the sentences, making them suitable for the subsequent human annotation.
SF-MASK is a collection of 20k low-resolution images, ranging from 7 x 7 to 64 x 64 pixels, exported from diverse and heterogeneous datasets. An accurate visualization of this collection through counting grids made it possible to highlight gaps in the variety of head poses of the pedestrians.
This dataset provides a collection of 162K images and 70 videos of Meta-Humans. It features 10 highly realistic Meta-Humans expressing 7 facial expressions.