Datasets

3,275 machine learning datasets

3,275 dataset results

PathVQA

PathVQA consists of 32,799 open-ended questions from 4,998 pathology images where each question is manually checked to ensure correctness.

89 papers0 benchmarksImages, Medical, Texts

ImageNet-O consists of images from classes that are not found in the ImageNet-1k dataset. It is used to test the robustness of vision models to out-of-distribution samples. It's reported using the AUPR metric.

89 papers0 benchmarksImages

PPMI (Parkinson’s Progression Markers Initiative)

The Parkinson’s Progression Markers Initiative (PPMI) dataset originates from an observational clinical and longitudinal study comprising evaluations of people with Parkinson’s disease (PD), those people with high risk, and those who are healthy.

87 papers1 benchmarksImages, Medical

ONCE (One Million Scenes)

ONCE (One millioN sCenEs) is a dataset for 3D object detection in the autonomous driving scenario. The ONCE dataset consists of 1 million LiDAR scenes and 7 million corresponding camera images. The data is selected from 144 driving hours, which is 20x longer than other 3D autonomous driving datasets available like nuScenes and Waymo, and it is collected across a range of different areas, periods and weather conditions.

87 papers6 benchmarksImages, LiDAR

DICM

DICM is a dataset for low-light enhancement which consists of 69 images collected with commercial digital cameras.

86 papers3 benchmarksImages

DIODE (Dense Indoor and Outdoor Depth)

Diode Dense Indoor/Outdoor DEpth (DIODE) is the first standard dataset for monocular depth estimation comprising diverse indoor and outdoor scenes acquired with the same hardware setup. The training set consists of 8574 indoor and 16884 outdoor samples from 20 scans each. The validation set contains 325 indoor and 446 outdoor samples with each set from 10 different scans. The ground truth density for the indoor training and validation splits are approximately 99.54% and 99%, respectively. The density of the outdoor sets are naturally lower with 67.19% for training and 78.33% for validation subsets. The indoor and outdoor ranges for the dataset are 50m and 300m, respectively.

86 papers6 benchmarksImages, RGB-D

AbstractReasoning

AbstractReasoning is a dataset for abstract reasoning, where the goal is to infer the correct answer from the context panels based on abstract reasoning.

86 papers0 benchmarksImages

CULane

CULane is a large scale challenging dataset for academic research on traffic lane detection. It is collected by cameras mounted on six different vehicles driven by different drivers in Beijing. More than 55 hours of videos were collected and 133,235 frames were extracted. The dataset is divided into 88880 images for training set, 9675 for validation set, and 34680 for test set. The test set is divided into normal and 8 challenging categories.

85 papers4 benchmarksImages, Videos

BigEarthNet

BigEarthNet consists of 590,326 Sentinel-2 image patches, each of which is a section of i) 120x120 pixels for 10m bands; ii) 60x60 pixels for 20m bands; and iii) 20x20 pixels for 60m bands.

85 papers8 benchmarksHyperspectral images, Images

Structured3D

Structured3D is a large-scale photo-realistic dataset containing 3.5K house designs (a) created by professional designers with a variety of ground truth 3D structure annotations (b) and generate photo-realistic 2D images (c). The dataset consists of rendering images and corresponding ground truth annotations (e.g., semantic, albedo, depth, surface normal, layout) under different lighting and furniture configurations.

85 papers4 benchmarks3D, Images

PROMISE12

The PROMISE12 dataset was made available for the MICCAI 2012 prostate segmentation challenge. Magnetic Resonance (MR) images (T2-weighted) of 50 patients with various diseases were acquired at different locations with several MRI vendors and scanning protocols.

84 papers1 benchmarksImages, MRI, Medical

VITON-HD (High-Resolution VITON-Zalando Dataset)

VITON-HD dataset is a dataset for high-resolution (i.e., 1024x768) virtual try-on of clothing items. Specifically, it consists of 13,679 frontal-view woman and top clothing image pairs.

84 papers5 benchmarksImages

NLVR (Natural Language Visual Reasoningnatural language for visual reasoning)

NLVR contains 92,244 pairs of human-written English sentences grounded in synthetic images. Because the images are synthetically generated, this dataset can be used for semantic parsing.

83 papers3 benchmarksImages, Texts

RaFD (Radboud Faces Database)

The Radboud Faces Database (RaFD) is a set of pictures of 67 models (both adult and children, males and females) displaying 8 emotional expressions.

81 papers9 benchmarksImages

iSAID

iSAID contains 655,451 object instances for 15 categories across 2,806 high-resolution images. The images of iSAID is the same as the DOTA-v1.0 dataset, which are manily collected from the Google Earth, some are taken by satellite JL-1, the others are taken by satellite GF-2 of the China Centre for Resources Satellite Data and Application.

81 papers9 benchmarksImages

ChestX-ray8

ChestX-ray8 is a medical imaging dataset which comprises 108,948 frontal-view X-ray images of 32,717 (collected from the year of 1992 to 2015) unique patients with the text-mined eight common disease labels, mined from the text radiological reports via NLP techniques.

81 papers0 benchmarksImages, Medical

LoveDA (Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation)

5987 high spatial resolution (0.3 m) remote sensing images from Nanjing, Changzhou, and Wuhan Focus on different geographical environments between Urban and Rural Advance both semantic segmentation and domain adaptation tasks Three considerable challenges: Multi-scale objects Complex background samples Inconsistent class distributions

81 papers2 benchmarksImages

Oulu-CASIA (Oulu-CASIA NIR&VIS facial expression database)

The Oulu-CASIA NIR&VIS facial expression database consists of six expressions (surprise, happiness, sadness, anger, fear and disgust) from 80 people between 23 and 58 years old. 73.8% of the subjects are males. The subjects were asked to sit on a chair in the observation room in a way that he/ she is in front of camera. Camera-face distance is about 60 cm. Subjects were asked to make a facial expression according to an expression example shown in picture sequences. The imaging hardware works at the rate of 25 frames per second and the image resolution is 320 × 240 pixels.

80 papers12 benchmarksImages, Videos

Volleyball

Volleyball is a video action recognition dataset. It has 4830 annotated frames that were handpicked from 55 videos with 9 player action labels and 8 team activity labels. It contains group activity annotations as well as individual activity annotations.

80 papers5 benchmarksImages, Videos

LFSD (Light Field Saliency Database)

The Light Field Saliency Database (LFSD) contains 100 light fields with 360×360 spatial resolution. A rough focal stack and an all-focus image are provided for each light field. The images in this dataset usually have one salient foreground object and a background with good color contrast.

80 papers20 benchmarksImages

PreviousPage 17 of 164Next