Datasets

3,275 machine learning datasets

BCCD

BCCD is a small-scale dataset for blood cell detection.

4 papers | 0 benchmarks | Images

MultiSense

MultiSense is a dataset of 9,504 images annotated with an English verb and its translation in Spanish and German.

4 papers | 0 benchmarks | Images, Texts

STAIR Captions

STAIR Captions is a large-scale dataset containing 820,310 Japanese captions. This dataset can be used for caption generation, multimodal retrieval, and image generation.

4 papers | 0 benchmarks | Images, Texts

3DNet

The 3DNet dataset is a free resource for object class recognition and 6DOF pose estimation from point cloud data. 3DNet provides large-scale hierarchical CAD-model databases of increasing size and difficulty, with 10, 60 and 200 object classes, together with evaluation datasets that contain thousands of scenes captured with an RGB-D sensor.

4 papers | 0 benchmarks | Images

CSAW-S

CSAW-S is a dataset of mammography images which includes expert annotations of tumors and non-expert annotations of breast anatomy and artifacts in the image.

4 papers | 0 benchmarks | Images

FIGR-8

The FIGR-8 database is a dataset containing 1,548,256 images in 17,375 classes, representing pictograms, ideograms, icons, emoticons, or depictions of objects or concepts. It aims to set a benchmark for few-shot image generation tasks, although it is not limited to them. Each image is 192x192 pixels with grayscale values from 0 to 255. Classes are not balanced (they do not all contain the same number of elements), but each contains at least 8 images.

4 papers | 0 benchmarks | Images

EDEN

EDEN (Enclosed garDEN) is a multimodal synthetic dataset for nature-oriented applications. The dataset features more than 300K images captured from more than 100 garden models. Each image is annotated with various low/high-level vision modalities, including semantic segmentation, depth, surface normals, intrinsic colors, and optical flow.

4 papers | 0 benchmarks | Images, RGB-D

3D Ken Burns Dataset

The 3D Ken Burns Dataset is a large-scale synthetic dataset that contains accurate ground-truth depth for various photo-realistic scenes.

4 papers | 0 benchmarks | Images, RGB-D

AeroRIT

AeroRIT is a hyperspectral dataset to facilitate aerial hyperspectral scene understanding.

4 papers | 0 benchmarks | Images

Brno-Urban-Dataset

This self-driving dataset, collected in Brno, Czech Republic, contains data from four WUXGA cameras, two 3D LiDARs, an inertial measurement unit, an infrared camera, and a differential RTK GNSS receiver with centimetre accuracy.

4 papers | 0 benchmarks | Images

Danbooru2020

A large-scale anime image database with 4.2m+ images annotated with 130m+ text tags describing image contents in detail; it can be useful for machine learning purposes such as image recognition and generation. It has been applied to a wide variety of applications, particularly generative modeling.

4 papers | 0 benchmarks | Images, Texts

DDI-100 (Distorted Document Images)

The DDI-100 dataset is a synthetic dataset for text detection and recognition. It is based on 7,000 unique real document pages and consists of more than 100,000 augmented images. The ground truth comprises text and stamp masks, plus text and character bounding boxes with relevant annotations.

4 papers | 0 benchmarks | Images

Dunhuang Grottoes Painting Dataset

This dataset provides a large number of training and testing examples, sufficient for a deep learning approach to Dunhuang Grotto painting restoration.

4 papers | 0 benchmarks | Images

EDUB-Seg (Egocentric Dataset of the University of Barcelona – Segmentation)

Egocentric Dataset of the University of Barcelona – Segmentation (EDUB-Seg) is a dataset for egocentric event segmentation acquired with the Narrative Clip, which takes a picture every 30 seconds. The dataset contains a total of 18,735 images captured by 7 different users over a total of 20 days. To ensure diversity, the users wore the camera in different contexts: while attending a conference, on holiday, during the weekend, and during the week.

4 papers | 0 benchmarks | Images

EMU (Edited Media Understanding)

EMU contains 48k question-answer pairs written in rich natural language.

4 papers | 0 benchmarks | Images, Texts

EndoSLAM (Endoscopic SLAM dataset)

The endoscopic SLAM dataset (EndoSLAM) is a dataset for depth estimation in endoscopic videos. It consists of both ex-vivo and synthetically generated data. The ex-vivo part of the dataset includes standard as well as capsule endoscopy recordings. The dataset is divided into 35 sub-datasets: 18 for the colon, 5 for the small intestine, and 12 for the stomach.

4 papers | 0 benchmarks | Images

ESAD (SARAS Endoscopic Surgeon Action Detection)

ESAD is a large-scale dataset designed to tackle the problem of surgeon action detection in endoscopic minimally invasive surgery. ESAD aims to help increase the effectiveness and reliability of surgical assistant robots by realistically testing their awareness of the actions performed by a surgeon. The dataset provides bounding box annotations for 21 action classes on real endoscopic video frames captured during prostatectomy, and was used as the basis of a MIDL 2020 challenge.

4 papers | 0 benchmarks | Images, Medical

Hyperspectral City

Hyperspectral City is a dataset that adopts multi-channel visual input.

4 papers | 8 benchmarks | Images

Kitchen Scenes

Kitchen Scenes is a multi-view RGB-D dataset of nine kitchen scenes, each containing several objects in realistic cluttered environments, including a subset of objects from the BigBird dataset. The viewpoints of the scenes are densely sampled, and objects in the scenes are annotated both with bounding boxes and in the 3D point cloud.

4 papers | 0 benchmarks | 3D, Images, Videos

MJU-Waste

MJU-Waste is an RGB-D waste object segmentation dataset, made publicly available to facilitate future research in this area.

4 papers | 5 benchmarks | Images
