The MRNet dataset consists of 1,370 knee MRI exams performed at Stanford University Medical Center. The dataset contains 1,104 (80.6%) abnormal exams, with 319 (23.3%) ACL tears and 508 (37.1%) meniscal tears; labels were obtained through manual extraction from clinical reports.
Useful for two applications: automatic readability assessment and automatic text simplification. The corpus consists of 189 texts, each in three versions (567 in total).
A realistic dataset composed of 27 episodes from 6 popular TV series. The dataset spans over 16 hours of footage annotated with 30 action classes, totaling 6,231 action instances.
The VidSTG dataset is a spatio-temporal video grounding dataset constructed on top of the video relation dataset VidOR. VidOR contains 7,000 training, 835 validation, and 2,165 testing videos. The goal of the Spatio-Temporal Video Grounding (STVG) task is to localize the spatio-temporal section of an untrimmed video that matches a given sentence depicting an object. VidSTG contains 5,563, 618, and 743 videos for training, validation, and testing, respectively.
VT5000 includes 5,000 spatially aligned RGBT image pairs with ground-truth annotations. It covers 11 challenges collected in different scenes and environments for exploring the robustness of algorithms.
GraspNet-1Billion provides large-scale training data and a standard evaluation platform for the task of general robotic grasping. The dataset contains 97,280 RGB-D images with over one billion grasp poses.
The COUGHVID dataset provides over 20,000 crowdsourced cough recordings representing a wide range of subject ages, genders, geographic locations, and COVID-19 statuses. First, the dataset was filtered using an open-sourced cough detection algorithm. Second, experienced pulmonologists labeled more than 2,000 recordings to diagnose medical abnormalities present in the coughs, thereby contributing one of the largest expert-labeled cough datasets in existence that can be used for a plethora of cough audio classification tasks.
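The two-stage curation described above can be sketched as a simple filtering step; the field names, scores, and threshold here are hypothetical, since the actual open-sourced detector operates on raw audio.

```python
# Sketch of COUGHVID's stage-1 curation (names and threshold hypothetical):
# keep only recordings that a cough-detection model scores as likely coughs;
# a subset of the survivors would then go to expert pulmonologist labeling.

def filter_recordings(recordings, threshold=0.8):
    """Keep recordings whose detection probability clears the threshold."""
    return [r for r in recordings if r["cough_prob"] >= threshold]

recordings = [
    {"id": "a", "cough_prob": 0.95},  # clear cough, kept
    {"id": "b", "cough_prob": 0.40},  # likely background noise, dropped
    {"id": "c", "cough_prob": 0.85},  # kept
]
kept = filter_recordings(recordings)
print([r["id"] for r in kept])  # -> ['a', 'c']
```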
Pushshift makes available all the submissions and comments posted on Reddit between June 2005 and April 2019. The dataset consists of 651,778,198 submissions and 5,601,331,385 comments posted on 2,888,885 subreddits.
Multicultural Reasoning over Vision and Language (MaRVL) is a dataset based on an ImageNet-style hierarchy representative of many languages and cultures (Indonesian, Mandarin Chinese, Swahili, Tamil, and Turkish). The selection of both concepts and images is entirely driven by native speakers. Afterwards, we elicit statements from native speakers about pairs of images. The task consists in discriminating whether each grounded statement is true or false.
PartImageNet is a large, high-quality dataset with part segmentation annotations. It consists of 158 classes from ImageNet with approximately 24,000 images. PartImageNet offers part-level annotations on a general set of classes with non-rigid, articulated objects, while having an order of magnitude larger size compared to existing datasets. It can be utilized in multiple vision tasks including but not limited to: Part Discovery, Semantic Segmentation, Few-shot Learning.
The dataset is manually collected from Google Earth. It consists of six large bi-temporal high-resolution images covering six cities in China (Beijing, Chengdu, Shenzhen, Chongqing, Wuhan, and Xian). Five of the large image pairs (Beijing, Chengdu, Shenzhen, Chongqing, and Wuhan) are clipped into 394 subimage pairs of size 512×512; after data augmentation, a collection of 3,940 bi-temporal image pairs is acquired. The Xian image pair is clipped into 48 image pairs for model testing. There are 3,600 image pairs in the training set, 340 image pairs in the validation set, and 48 image pairs in the test set.
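The clipping step above can be sketched as non-overlapping tiling; this is a minimal illustration, and the dataset's actual preprocessing (tile overlap, augmentation) may differ.

```python
import numpy as np

def clip_to_tiles(image, tile=512):
    """Split an H×W×C image into non-overlapping tile×tile patches,
    discarding any remainder at the right and bottom edges."""
    h, w = image.shape[:2]
    return [
        image[y:y + tile, x:x + tile]
        for y in range(0, h - tile + 1, tile)
        for x in range(0, w - tile + 1, tile)
    ]

# A toy 1024×1536 "image" yields a 2×3 grid of 512×512 tiles.
img = np.zeros((1024, 1536, 3), dtype=np.uint8)
tiles = clip_to_tiles(img)
print(len(tiles))  # -> 6
```

For a bi-temporal pair, the same tiling would be applied to both epochs so corresponding patches stay spatially aligned.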
A dataset for rain removal with scene depth information. Unlike previous datasets, its images are all outdoor photos, each with a depth map, and the rain images exhibit different degrees of rain and fog.
MassiveText is a collection of large English-language text datasets from multiple sources: web pages, books, news articles, and code. The data pipeline includes text quality filtering, removal of repetitious text, deduplication of similar documents, and removal of documents with significant test-set overlap. MassiveText contains 2.35 billion documents or about 10.5 TB of text.
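One step of the pipeline above, deduplication, can be illustrated with exact-match hashing; note this is only a sketch, as MassiveText's pipeline also removes *near*-duplicate documents, which requires approximate methods (e.g., MinHash) rather than the exact matching shown here.

```python
import hashlib

def dedup_exact(documents):
    """Drop byte-identical duplicate documents, keeping the first occurrence.

    Hashing avoids keeping full document texts in the seen-set,
    which matters at billions-of-documents scale.
    """
    seen, kept = set(), []
    for doc in documents:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(doc)
    return kept

docs = ["the cat sat", "the dog ran", "the cat sat"]
print(dedup_exact(docs))  # -> ['the cat sat', 'the dog ran']
```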
OpenImage-O is manually annotated, comes with a naturally diverse distribution, and has a large scale. It is built to overcome several shortcomings of existing OOD benchmarks. The dataset is filtered image by image from the test set of OpenImage-V3, which was collected from Flickr without a predefined list of class names or tags, leading to natural class statistics and avoiding an initial design bias.
Omni-Realm Benchmark (OmniBenchmark) is a diverse (21 realm-wise semantic datasets) and concise (the realm-wise datasets have no overlapping concepts) benchmark for evaluating pre-trained model generalization across semantic super-concepts/realms, e.g. from mammals to aircraft.
Konstanz artificially distorted image quality database (KADID-10k) contains 81 pristine images, each degraded by 25 distortions in 5 levels.
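The counts above multiply out to the roughly 10,000 distorted images the name suggests, which a one-line check confirms:

```python
# KADID-10k size: every pristine image is degraded by every
# distortion type at every severity level.
pristine = 81
distortion_types = 25
severity_levels = 5

total_distorted = pristine * distortion_types * severity_levels
print(total_distorted)  # -> 10125, i.e. the "10k" in KADID-10k
```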
Multiface consists of high-quality recordings of the faces of 13 identities, each captured in a multi-view capture stage while performing various facial expressions. Each subject has an average of 12,200 (v1 scripts) to 23,000 (v2 scripts) frames, captured at 30 fps. Each frame includes roughly 40 (v1) to 160 (v2) different camera views under uniform illumination, yielding a total dataset size of 65 TB.
Spring is a large, high-resolution and high-detail, computer-generated benchmark for scene flow, optical flow, and stereo. Based on rendered scenes from the open-source Blender movie "Spring", it provides photo-realistic HD datasets with state-of-the-art visual effects and ground truth training data.
An RGB-D dataset of synthetic indoor scenes providing color images, noisy depth maps, etc.
The WoZ 2.0 dataset is a newer dialogue state tracking dataset whose evaluation is detached from the noisy output of speech recognition systems. Similar to DSTC2, it covers the restaurant search domain and has identical evaluation.