SeaDronesSee is a large-scale dataset aimed at helping develop systems for Search and Rescue (SAR) using Unmanned Aerial Vehicles (UAVs) in maritime scenarios. Building highly complex autonomous UAV systems that aid in SAR missions requires robust computer vision algorithms to detect and track objects or persons of interest. The dataset provides three tracks: object detection, single-object tracking, and multi-object tracking. Each track has its own dataset and leaderboard.
The Endomapper dataset is the first collection of complete endoscopy sequences acquired during regular medical practice, including slow and careful screening explorations, making secondary use of medical data. Its original purpose is to facilitate the development and evaluation of VSLAM (Visual Simultaneous Localization and Mapping) methods on real endoscopy data. The first release of the dataset is composed of 50 sequences with a total of more than 13 hours of video. It is also the first endoscopic dataset that includes both the computed geometric and photometric endoscope calibration and the original calibration videos. Meta-data and annotations associated with the dataset range from anatomical-landmark and procedure-description labeling to tool segmentation masks, COLMAP 3D reconstructions, simulated sequences with ground truth, and meta-data on special cases such as sequences from the same patient. This information will advance research in endoscopic VSLAM.
The gender-biased FFHQ dataset (bFFHQ) uses age as the target label and gender as a correlated bias attribute; the images are drawn from the FFHQ dataset. The training data is dominated by young women (aged 10-29) and old men (aged 40-59).
The dataset was created for the video quality assessment problem. It consists of 36 clips from Vimeo, selected from 18,000+ open-source clips with high bitrate (CC BY or CC0 license).
TID2013 is a dataset for image quality assessment that contains 25 reference images and 3000 distorted images (25 reference images x 24 types of distortions x 5 levels of distortions).
ApolloCar3D is a dataset that contains 5,277 driving images and over 60K car instances, where each car is fitted with an industry-grade 3D CAD model with absolute model size and semantically labelled keypoints. This dataset is more than 20 times larger than PASCAL3D+ and KITTI, the previous state of the art.
The Kumar dataset contains 30 1,000×1,000 image tiles from seven organs (6 breast, 6 liver, 6 kidney, 6 prostate, 2 bladder, 2 colon and 2 stomach) of The Cancer Genome Atlas (TCGA) database acquired at 40× magnification. Within each image, the boundary of each nucleus is fully annotated.
The New College Dataset is a freely available dataset collected from a robot completing several loops outdoors around the New College campus in Oxford. The data includes odometry, laser scan, and visual information. The dataset URL is no longer working.
SceneNet is a dataset of labelled synthetic indoor scenes.
CDTB (color-and-depth visual object tracking) is a dataset recorded with several passive and active RGB-D setups and contains indoor as well as outdoor sequences acquired in direct sunlight. The sequences were recorded to contain significant object pose change, clutter, occlusion, and periods of long-term target absence, enabling tracker evaluation under realistic conditions. Sequences are annotated per frame with 13 visual attributes for detailed analysis. The dataset contains around 100,000 samples. Source: https://www.vicos.si/Projects/CDTB
CASIA-HWDB is a dataset for handwritten Chinese character recognition. It contains 300 files (240 in HWDB1.1 training set and 60 in HWDB1.1 test set). Each file contains about 3000 isolated gray-scale Chinese character images written by one writer, as well as their corresponding labels.
The D-HAZY dataset is generated from the NYU Depth indoor image collection and provides a depth map for each indoor hazy image. It contains 1,400+ real images and corresponding depth maps used to synthesize hazy scenes based on Koschmieder's light propagation model.
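Haze synthesis of this kind can be sketched with Koschmieder's model, I(x) = J(x)·t(x) + A·(1 − t(x)) with transmission t(x) = exp(−β·d(x)). The snippet below is a minimal illustration, not the D-HAZY generation code; the scattering coefficient `beta` and airlight `airlight` are assumed example values.

```python
import numpy as np

def synthesize_haze(clear, depth, beta=1.0, airlight=0.8):
    """Apply Koschmieder's light-propagation model to a clear image.

    clear:    H x W x 3 float array in [0, 1] (the haze-free image J)
    depth:    H x W float array of scene depths d(x)
    beta:     atmospheric scattering coefficient (assumed value)
    airlight: global atmospheric light A (assumed value)
    """
    # Transmission map: t(x) = exp(-beta * d(x)), broadcast over channels
    t = np.exp(-beta * depth)[..., None]
    # Hazy image: I(x) = J(x) * t(x) + A * (1 - t(x))
    return clear * t + airlight * (1.0 - t)
```

Pixels at greater depth have lower transmission and therefore pick up more airlight, which is what gives distant scene content its washed-out appearance.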
CLEVR-Ref+ is a synthetic diagnostic dataset for referring expression comprehension. The precise locations and attributes of the objects are readily available, and the referring expressions are automatically associated with functional programs. The synthetic nature allows control over dataset bias (through sampling strategy), and the modular programs enable intermediate reasoning ground truth without human annotators.
The Fluorescence Microscopy Denoising (FMD) dataset is dedicated to Poisson-Gaussian denoising. The dataset consists of 12,000 real fluorescence microscopy images obtained with commercial confocal, two-photon, and wide-field microscopes and representative biological samples such as cells, zebrafish, and mouse brain tissues. Image averaging is used to obtain ground-truth images, alongside 60,000 noisy images with different noise levels.
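The two ideas behind FMD, a Poisson-Gaussian noise model and image averaging to approximate ground truth, can be illustrated with a small simulation. This is a generic sketch with assumed parameters (`photons`, `sigma`), not the FMD acquisition pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def poisson_gaussian(img, photons=50.0, sigma=0.02):
    """Simulate Poisson shot noise plus Gaussian read noise.

    img: clean image in [0, 1]; photons and sigma are assumed parameters
    controlling the signal-dependent and signal-independent noise levels.
    """
    shot = rng.poisson(img * photons) / photons   # signal-dependent component
    read = rng.normal(0.0, sigma, img.shape)      # signal-independent component
    return shot + read

clean = np.full((64, 64), 0.5)
single = poisson_gaussian(clean)
# Averaging many noisy captures of the same static scene suppresses the
# zero-mean noise and approximates the ground-truth image.
avg = np.mean([poisson_gaussian(clean) for _ in range(50)], axis=0)
```

Since both noise components are (approximately) zero-mean around the clean signal, the averaged image converges to the clean one at a rate of roughly 1/√N for N captures.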
MLRSNet is a multi-label high-spatial-resolution remote sensing dataset for semantic scene understanding. It provides different perspectives of the world captured from satellites; that is, it is composed of high-spatial-resolution optical satellite images. MLRSNet contains 109,161 remote sensing images annotated into 46 categories, with the number of sample images per category varying from 1,500 to 3,000. The images have a fixed size of 256×256 pixels with various pixel resolutions (~10m to 0.1m). Moreover, each image in the dataset is tagged with several of 60 predefined class labels, and the number of labels associated with each image varies from 1 to 13. The dataset can be used for multi-label image classification, multi-label image retrieval, and image segmentation.
RPC is a large-scale retail product checkout dataset covering 200 retail SKUs. The SKUs fall into 17 meta-categories: puffed food, dried fruit, dried food, instant drink, instant noodles, dessert, drink, alcohol, milk, canned food, chocolate, gum, candy, seasoner, personal hygiene, tissue, and stationery.
The dataset for this challenge was obtained by carefully annotating tissue images of several patients with tumors of different organs, diagnosed at multiple hospitals. It was created by downloading H&E-stained tissue images captured at 40x magnification from the TCGA archive. H&E staining is a routine protocol to enhance the contrast of a tissue section and is commonly used for tumor assessment (grading, staging, etc.). Given the diversity of nuclei appearances across multiple organs and patients, and the richness of staining protocols adopted at multiple hospitals, the training dataset will enable the development of robust and generalizable nuclei segmentation techniques that work right out of the box.
VQA-E is a dataset for Visual Question Answering with Explanation, where models are required to generate an explanation along with the predicted answer. The VQA-E dataset is automatically derived from the VQA v2 dataset by synthesizing a textual explanation for each image-question-answer triple.
The shiny folder contains 8 scenes with challenging view-dependent effects used in our paper. We also provide additional scenes in the shiny_extended folder. The test images for each scene used in our paper consist of one of every eight images in alphabetical order.
The York Urban Line Segment Database is a compilation of 102 images (45 indoor, 57 outdoor) of urban environments consisting mostly of scenes from the campus of York University and downtown Toronto, Canada. The images are 640×480 in size and were taken with a calibrated Panasonic Lumix DMC-LC80 digital camera.