Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

3,275 machine learning datasets

Filter by Modality

  • Images 3,275
  • Texts 3,148
  • Videos 1,019
  • Audio 486
  • Medical 395
  • 3D 383
  • Time series 298
  • Graphs 285
  • Tabular 271
  • Speech 199
  • RGB-D 192
  • Environment 148
  • Point cloud 135
  • Biomedical 123
  • LiDAR 95
  • RGB Video 87
  • Tracking 78
  • Biology 71
  • Actions 68
  • 3D meshes 65
  • Tables 52
  • Music 48
  • EEG 45
  • Hyperspectral images 45
  • Stereo 44
  • MRI 39
  • Physics 32
  • Interactive 29
  • Dialog 25
  • MIDI 22
  • 6D 17
  • Replay data 11
  • Financial 10
  • Ranking 10
  • CAD 9
  • fMRI 7
  • Parallel 6
  • Lyrics 2
  • PSG 2

3,275 dataset results

Tencent ML-Images

Tencent ML-Images is a large open-source multi-label image database, including 17,609,752 training and 88,739 validation image URLs, which are annotated with up to 11,166 categories.

5 papers · 0 benchmarks · Images

Twitter100k

Twitter100k is a large-scale dataset for weakly supervised cross-media retrieval.

5 papers · 0 benchmarks · Images, Texts

WGISD (Embrapa Wine Grape Instance Segmentation Dataset)

The Embrapa Wine Grape Instance Segmentation Dataset (WGISD) contains 300 images with properly annotated grape clusters, along with a novel annotation methodology for the segmentation of complex objects in natural images.

5 papers · 0 benchmarks · Images

WWW Crowd

WWW Crowd provides 10,000 videos with over 8 million frames from 8,257 diverse scenes, therefore offering a comprehensive dataset for the area of crowd understanding.

5 papers · 0 benchmarks · Images, Videos

PKU-Reid

PKU-Reid contains 1,824 images of 114 individuals captured from two disjoint camera views. For each person, eight images are captured from eight different orientations under each camera view and normalized to 128×48 pixels. The dataset is split randomly into two parts: 57 individuals for training and the other 57 for testing.

5 papers · 1 benchmark · Images
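The 57/57 random identity split described above can be sketched as follows. This is a hypothetical illustration, not the dataset's actual tooling; the integer identity IDs and the seed are assumptions.

```python
import random

def split_identities(ids, n_train=57, seed=0):
    """Randomly split person identities into train/test halves.

    `ids` is any iterable of identity labels; `n_train` identities go to
    the training split and the rest to the test split. The seed is an
    assumption made here for reproducibility.
    """
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]

# PKU-Reid has 114 individuals; IDs 1..114 are assumed for illustration.
train_ids, test_ids = split_identities(range(1, 115))
print(len(train_ids), len(test_ids))  # 57 57
```

Splitting by identity (rather than by image) keeps all images of a given person in a single split, which is the standard protocol for person re-identification benchmarks.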

BIRD (Blocksworld Image Reasoning Dataset)

Blocksworld Image Reasoning Dataset (BIRD) contains images of wooden blocks in different configurations, and the sequence of moves to rearrange one configuration to the other.

5 papers · 0 benchmarks · Images

MSAW (Multi-Sensor All Weather Mapping)

Multi-Sensor All Weather Mapping (MSAW) is a dataset and challenge featuring two collection modalities (SAR and optical). The dataset and challenge focus on mapping and building footprint extraction using a combination of these data sources. MSAW covers 120 km² over multiple overlapping collects and is annotated with over 48,000 unique building footprint labels, enabling the creation and evaluation of mapping algorithms for multi-modal data.

5 papers · 2 benchmarks · Images

StreetStyle

StreetStyle is a large-scale dataset of photos of people annotated with clothing attributes, used to train attribute classifiers via deep learning.

5 papers · 0 benchmarks · Images

ELAS

ELAS is a dataset for lane detection. It contains more than 20 different scenes (in more than 15,000 frames) and considers a variety of scenarios (urban road, highways, traffic, shadows, etc.). The dataset was manually annotated for several events that are of interest for the research community (i.e., lane estimation, change, and centering; road markings; intersections; LMTs; crosswalks and adjacent lanes).

5 papers · 0 benchmarks · Images

ObjectsRoom

The ObjectsRoom dataset is based on the MuJoCo environment used by the Generative Query Network and is a multi-object extension of the 3d-shapes dataset. The training set contains 1M scenes with up to three objects, and roughly 1K test examples are provided for several variants.

5 papers · 3 benchmarks · Images

Lesion Boundary Segmentation Dataset

Lesion Boundary Segmentation Dataset is a dataset for lesion segmentation from the ISIC2018 challenge. The dataset contains skin lesions and their corresponding annotations.

5 papers · 0 benchmarks · Images, Medical

IG-1B-Targeted

IG-1B-Targeted is an internal Facebook AI Research dataset consisting of 940 million public images with 1.5K hashtags matching 1,000 ImageNet-1K synsets.

5 papers · 0 benchmarks · Images, Texts

BanglaLekhaImageCaptions

This dataset consists of images and annotations in Bengali, with each image annotated by two adult native Bengali speakers. All popular image captioning datasets have a predominant Western cultural bias, with annotations written in English. Using such datasets to train an image captioning system assumes both that a good English-to-target-language translation system exists and that the original dataset contained elements of the target culture. Both assumptions are false, creating the need for a culturally relevant dataset in Bengali that can generate appropriate captions for images relevant to the Bangladeshi and wider subcontinental context. The dataset consists of 9,154 images.

5 papers · 8 benchmarks · Images, Texts

Brain US

This brain anatomy segmentation dataset has 1,300 2D US scans for training and 329 for testing. A total of 1,629 in vivo B-mode US images were obtained from 20 different subjects (aged under one year) who were treated between 2010 and 2016. The dataset contains subjects with IVH and without (healthy subjects, but at risk of developing IVH). The US scans were collected using a Philips US machine with a C8-5 broadband curved array transducer, using coronal and sagittal scan planes. For every collected image, the ventricles and septum pellucidum were manually segmented by an expert ultrasonographer. The images were split randomly into 1,300 training and 329 testing images for experiments. Note that all images are of size 512 × 512.

5 papers · 2 benchmarks · Images, Medical

TRANCE (Transformation Driven Visual Reasoning)

TRANCE extends CLEVR by asking a uniform question, i.e. what is the transformation between two given images, to test transformation reasoning ability. TRANCE includes three levels of settings: Basic (single-step transformation), Event (multi-step transformation), and View (multi-step transformation with varying views). Detailed information can be found at https://hongxin2019.github.io/TVR.

5 papers · 0 benchmarks · Images

Sewer-ML

Sewer-ML is a sewer defect dataset. It contains 1.3 million images, from 75,618 videos collected from three Danish water utility companies over nine years. All videos have been annotated by licensed sewer inspectors following the Danish sewer inspection standard, Fotomanualen. This leads to consistent and reliable annotations, and a total of 17 annotated defect classes.

5 papers · 0 benchmarks · Images

TICaM (Time-of-flight In-car Cabin Monitoring)

TICaM is a Time-of-flight In-car Cabin Monitoring dataset for vehicle interior monitoring using a single wide-angle depth camera. It addresses the deficiencies of other available in-car cabin datasets in terms of the range of labeled classes, recorded scenarios, and provided annotations, all at the same time. It consists of an exhaustive list of actions performed while driving and multi-modal labeled images (depth, RGB, and IR), with complete annotations for 2D and 3D object detection and instance and semantic segmentation, as well as activity annotations for RGB frames. In addition to real recordings, it contains a synthetic dataset of in-car cabin images with the same image modalities and annotations, providing a uniquely beneficial combination of synthetic and real data for effectively training cabin monitoring systems and evaluating domain adaptation approaches.

5 papers · 0 benchmarks · Images, RGB-D

AHP (Amodal Human Perception)

The AHP dataset consists of 56,599 images in total, collected from several large-scale instance segmentation and detection datasets, including COCO, VOC (with SBD), LIP, Objects365, and OpenImages. Each image is annotated with a pixel-level segmentation mask of a single integrated human.

5 papers · 0 benchmarks · Images

TAS500

TAS500 is a semantic segmentation dataset for autonomous driving in unstructured environments. TAS500 offers fine-grained vegetation and terrain classes to learn drivable surfaces and natural obstacles in outdoor scenes effectively.

5 papers · 0 benchmarks · Images

READ 2016 (HTR Dataset ICFHR 2016)

This dataset arises from the READ project (Horizon 2020).

5 papers · 4 benchmarks · Images, Texts
Page 68 of 164