Datasets

3,275 machine learning datasets

RED (Real Embodied Dataset)

The Real Embodied Dataset (RED) is a large-scale computer vision dataset for grasping in cluttered scenes. It contains complete segmentation masks for partially occluded objects, along with their order of occlusion.

4 papers | 0 benchmarks | Images

S2TLD (SJTU Small Traffic Light Dataset)

S2TLD is a traffic light dataset containing 5,786 images at resolutions of approximately 1,080 × 1,920 and 720 × 1,280 pixels. It includes 14,130 instances across 5 categories (red, yellow, green, off, and wait on). The scenes cover a wide variety of typical road conditions:
* Busy inner-city street scenes
* Dense stop-and-go traffic
* Strong changes in illumination/exposure
* Flickering/fluctuating traffic lights
* Multiple visible traffic lights
* Image parts that can be confused with traffic lights (e.g. large round tail lights)

4 papers | 0 benchmarks | Images

ShapenetRender

ShapenetRender is an extension of the ShapeNet Core dataset with more variation in camera angles. For each mesh model, the dataset provides 36 views with smaller variation and 36 views with larger variation. The newly rendered images have a resolution of 224x224, in contrast to the original 137x137. Additionally, each RGB image is paired with a depth image, a normal map, and an albedo image.

4 papers | 0 benchmarks | Images

Synthinel-1

Synthinel-1 is a collection of synthetic overhead imagery with full pixel-wise building segmentation labels.

4 papers | 0 benchmarks | Images

TCG (Traffic Control Gesture)

The TCG dataset is used to evaluate Traffic Control Gesture recognition for autonomous driving. It provides 3D body skeleton input for classifying traffic control gestures at every time step. The dataset consists of 250 sequences from several actors, ranging from 16 to 90 seconds per sequence.

4 papers | 0 benchmarks | Images

UTA-RLDD (University of Texas at Arlington Real-Life Drowsiness Dataset)

UTA-RLDD consists of around 30 hours of video, with content ranging from subtle signs of drowsiness to more obvious ones.

4 papers | 0 benchmarks | Images

VIENA2

VIENA2 covers 5 generic driving scenarios, with a total of 25 distinct action classes. It contains more than 15K full HD, 5-second videos acquired under various driving conditions, weather, times of day, and environments, complemented with a common and realistic set of sensor measurements. This amounts to more than 2.25M frames, each annotated with an action label, corresponding to 600 samples per action class.

4 papers | 0 benchmarks | Images
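
The VIENA2 figures quoted above can be cross-checked with a few lines of arithmetic. This is only a rough sketch: the frame rate of 30 fps is an assumption (the description does not state it), the video count uses the "15K" lower bound, and the per-class figure assumes videos are spread evenly across the 25 classes.

```python
# Rough consistency check of the VIENA2 figures quoted in the description.
NUM_VIDEOS = 15_000    # "more than 15K" videos; lower bound used here
CLIP_SECONDS = 5       # each video is 5 seconds long
NUM_CLASSES = 25       # distinct action classes
ASSUMED_FPS = 30       # ASSUMPTION: frame rate is not given in the description

frames_per_video = CLIP_SECONDS * ASSUMED_FPS
total_frames = NUM_VIDEOS * frames_per_video
# ASSUMPTION: videos are evenly distributed across the action classes
samples_per_class = NUM_VIDEOS // NUM_CLASSES

print(frames_per_video)   # 150
print(total_frames)       # 2250000
print(samples_per_class)  # 600
```

Under these assumptions the totals match the stated 2.25M frames and 600 samples per action class.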

Verse

Verse is a new dataset that augments existing multimodal datasets (COCO and TUHOI) with sense labels.

4 papers | 0 benchmarks | Images

RDD-2020 (Road Damage Dataset 2020)

The Road Damage Dataset 2020 (RDD-2020) is a large-scale heterogeneous dataset comprising 26,620 images collected with smartphones from roads in India, Japan, and the Czech Republic.

4 papers | 0 benchmarks | Images

ScenicOrNot

ScenicOrNot (SoN) is a dataset of 185,548 images with associated natural beauty rating histograms. Each image in the dataset was rated at least five times. The images also have metadata like title and location.

4 papers | 0 benchmarks | Images

WHU-Hi (Wuhan UAV-borne hyperspectral image)

WHU-Hi dataset (Wuhan UAV-borne hyperspectral image) is collected and shared by the RSIDEA research group of Wuhan University, and it could serve as a benchmark dataset for precise crop classification and hyperspectral image classification studies. The WHU-Hi dataset contains three individual UAV-borne hyperspectral datasets: WHU-Hi-LongKou, WHU-Hi-HanChuan, and WHU-Hi-HongHu. All the datasets were acquired in farming areas with various crop types in Hubei province, China, via a Headwall Nano-Hyperspec sensor mounted on a UAV platform. Compared with spaceborne and airborne hyperspectral platforms, unmanned aerial vehicle (UAV)-borne hyperspectral systems can acquire hyperspectral imagery with a high spatial resolution (which we refer to here as H2 imagery). The research was published in Remote Sensing of Environment.

4 papers | 0 benchmarks | Hyperspectral images, Images

Kuzushiji-Kanji

Kuzushiji-Kanji is an imbalanced dataset of 3,832 Kanji characters in total (64x64 grayscale, 140,426 images), ranging from 1,766 examples to only a single example per class. Kuzushiji is a Japanese cursive writing style.

4 papers | 0 benchmarks | Images

CocoDoom

CocoDoom is a collection of pre-recorded data extracted from Doom gaming sessions along with annotations in the MS Coco format.

4 papers | 0 benchmarks | Images

MuMu

MuMu is a new dataset of more than 31k albums classified into 250 genre classes.

4 papers | 0 benchmarks | Audio, Images, Texts

FIRE (Fundus Image Registration Dataset)

Fundus Image Registration Dataset (FIRE) is a dataset consisting of 129 retinal images forming 134 image pairs. These image pairs are split into 3 categories depending on their characteristics. The images were acquired with a Nidek AFC-210 fundus camera, which produces images with a resolution of 2912x2912 pixels and a FOV of 45° in both the x and y dimensions. Images were acquired from 39 patients at the Papageorgiou Hospital, Aristotle University of Thessaloniki, Thessaloniki.

4 papers | 1 benchmark | Images, Medical

Imp1k

Imp1k is a new dataset of designs annotated with importance information.

4 papers | 0 benchmarks | Images

MERL-RAV (MERL Reannotation of AFLW with Visibility)

The MERL-RAV (MERL Reannotation of AFLW with Visibility) Dataset contains over 19,000 face images in a full range of head poses. Each face is manually labeled with the ground-truth locations of 68 landmarks, with the additional information of whether each landmark is unoccluded, self-occluded (due to extreme head poses), or externally occluded. The images were annotated by professional labelers, supervised by researchers at Mitsubishi Electric Research Laboratories (MERL).

4 papers | 22 benchmarks | Images

METU Trademark

The METU Trademark Dataset is a large dataset (the largest publicly available logo dataset as of 2014, and the largest one not requiring any preprocessing as of 2017), which is composed of more than 900K real logos belonging to real companies worldwide. The dataset also includes query sets of varying difficulties, allowing Trademark Retrieval researchers to benchmark their methods against other methods to progress the field.

4 papers | 0 benchmarks | Images

CEDAR Signature

CEDAR Signature is a database of off-line signatures for signature verification. Each of 55 individuals contributed 24 signatures, creating 1,320 genuine signatures. Some were asked to forge three other writers' signatures, eight times per subject, creating 1,320 forgeries. Each signature was scanned at 300 dpi in gray-scale and binarized using a gray-scale histogram. Image preprocessing involved two steps: salt-and-pepper noise removal and slant normalization. The database has 24 genuine signatures and 24 forgeries available for each writer.

4 papers | 1 benchmark | Images
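
The CEDAR Signature counts follow directly from the per-writer numbers in the description. The short sketch below reproduces that arithmetic; the variable names are illustrative only and are not part of any CEDAR tooling.

```python
# Arithmetic behind the CEDAR Signature counts quoted in the description.
NUM_WRITERS = 55            # individuals who contributed signatures
GENUINE_PER_WRITER = 24     # genuine signatures per writer
FORGED_WRITERS = 3          # each forger imitated three other writers
ATTEMPTS_PER_WRITER = 8     # eight forgeries per imitated writer
FORGERIES_PER_WRITER = 24   # forgeries available for each writer

genuine_total = NUM_WRITERS * GENUINE_PER_WRITER
forgeries_per_forger = FORGED_WRITERS * ATTEMPTS_PER_WRITER
forgery_total = NUM_WRITERS * FORGERIES_PER_WRITER

print(genuine_total)         # 1320
print(forgeries_per_forger)  # 24
print(forgery_total)         # 1320
```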

K-Hairstyle

K-hairstyle is a novel large-scale Korean hairstyle dataset with 256,679 high-resolution images. In addition, K-hairstyle contains various hair attributes annotated by Korean expert hair stylists and hair segmentation masks.

4 papers | 0 benchmarks | Images
