Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)


DeepGlobe

Satellite imagery is a powerful source of information, as it contains more structured and uniform data than traditional images. Although the computer vision community has been tackling hard tasks on everyday image datasets using deep learning, satellite images have only recently gained attention for mapping and population analysis. The DeepGlobe workshop aims to bring together a diverse set of researchers to advance the state of the art in satellite image analysis.

127 papers · 9 benchmarks

SMAP (Soil Moisture Active Passive)

The Soil Moisture Active Passive (SMAP) dataset consists of telemetry data from NASA's Soil Moisture Active Passive satellite together with labelled anomalies. It was originally published in https://arxiv.org/abs/1802.04431 and used for unsupervised anomaly detection in time series data. It has since been used by many popular anomaly detection methods and benchmarks that distribute it in their repositories, e.g., https://github.com/OpsPAI/MTAD

127 papers · 16 benchmarks · Time series
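On SMAP, anomaly detection is commonly framed as prediction-error thresholding: a model forecasts each telemetry channel and points with unusually large residuals are flagged. The sketch below illustrates only that idea on synthetic data; the moving-average baseline, the fixed z-score threshold, and the stand-in signal are illustrative assumptions, not the dataset's official tooling.

```python
import numpy as np

def detect_anomalies(series, window=50, z_thresh=4.0):
    """Flag points whose residual from a simple baseline exceeds a z-score threshold.

    A centered moving average stands in for the learned predictors typically used
    with SMAP in the literature; only the residual-thresholding idea is shown.
    """
    baseline = np.convolve(series, np.ones(window) / window, mode="same")
    residuals = np.abs(series - baseline)
    z = (residuals - residuals.mean()) / (residuals.std() + 1e-8)
    return np.where(z > z_thresh)[0]

# Synthetic stand-in for one telemetry channel (illustrative only).
rng = np.random.default_rng(0)
channel = np.sin(np.linspace(0, 20 * np.pi, 2000)) + 0.05 * rng.standard_normal(2000)
channel[1200:1210] += 3.0  # injected anomaly
print(detect_anomalies(channel))
```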

PASCAL VOC 2007

PASCAL VOC 2007 is a dataset for image recognition. The twenty object classes that have been selected are: person; bird, cat, cow, dog, horse, sheep; aeroplane, bicycle, boat, bus, car, motorbike, train; bottle, chair, dining table, potted plant, sofa, and tv/monitor.

126 papers · 66 benchmarks · Images

LSMDC (Large Scale Movie Description Challenge)

This dataset contains 118,081 short video clips extracted from 202 movies. Each clip has a caption, either extracted from the movie script or from transcribed DVS (descriptive video services) for the visually impaired. The validation set contains 7,408 clips, and evaluation is performed on a test set of 1,000 videos from movies disjoint from the training and validation sets.

126 papers · 29 benchmarks · Audio, Texts, Videos

FBMS (Freiburg-Berkeley Motion Segmentation)

The Freiburg-Berkeley Motion Segmentation Dataset (FBMS-59) is an extension of the BMS dataset with 33 additional video sequences. A total of 720 frames are annotated with pixel-accurate segmentations of moving objects. FBMS-59 comes with a split into a training set and a test set.

126 papers · 4 benchmarks · Images, Videos

SUN3D

SUN3D is a large-scale RGB-D video database with 8 annotated sequences. Each frame has a semantic segmentation of the objects in the scene and information about the camera pose. It is composed of 415 sequences captured in 254 different spaces in 41 different buildings. Moreover, some places have been captured multiple times at different moments of the day.

126 papers · 0 benchmarks · 3D, Images, Point cloud, RGB-D, Videos

BLiMP (Benchmark of Linguistic Minimal Pairs)

BLiMP is a challenge set for evaluating what language models (LMs) know about major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each containing 1000 minimal pairs isolating specific contrasts in syntax, morphology, or semantics. The data is automatically generated according to expert-crafted grammars. Aggregate human agreement with the labels is 96.4%.

126 papers · 0 benchmarks · Texts
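BLiMP is typically scored in a forced-choice manner: a language model is credited for a pair when it assigns higher probability to the acceptable sentence than to the unacceptable one. A minimal scoring sketch is below; it assumes the Hugging Face transformers library and GPT-2 purely for illustration, and the two pairs shown are made up rather than drawn from BLiMP itself.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sentence_log_prob(sentence):
    """Total log-probability the model assigns to a sentence."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # Labels equal to inputs give the mean token-level NLL; multiply back out.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)

# Illustrative minimal pairs (not actual BLiMP items).
pairs = [
    ("The cats sleep on the sofa.", "The cats sleeps on the sofa."),
    ("Who did you talk to?", "Who did you talked to?"),
]
correct = sum(sentence_log_prob(good) > sentence_log_prob(bad) for good, bad in pairs)
print(f"accuracy: {correct / len(pairs):.2f}")
```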

DFDC (Deepfake Detection Challenge)

The DFDC (Deepfake Detection Challenge) dataset is a deepfake detection dataset consisting of more than 100,000 videos.

126 papers · 8 benchmarks · Videos

LUNA

The LUNA challenges provide datasets for automatic nodule detection algorithms using the largest publicly available reference database of chest CT scans, the LIDC-IDRI dataset. In LUNA16, participants develop their algorithm and upload predictions on 888 CT scans in one of two tracks: 1) the complete nodule detection track, where a full CAD system should be developed, or 2) the false positive reduction track, where a provided set of nodule candidates should be classified.

125 papers · 4 benchmarks · Images, Medical

FreiHAND

FreiHAND is a 3D hand pose dataset that records different hand actions performed by 32 people. For each hand image, MANO-based 3D hand pose annotations are provided. It currently contains 32,560 unique training samples and 3,960 unique evaluation samples. The training samples are recorded with a green-screen background, allowing for background removal. In addition, three different post-processing strategies are applied to the training samples for data augmentation; these strategies are not applied to the evaluation samples.

125 papers · 24 benchmarks · 3D, Images

Aff-Wild

Aff-Wild is a large-scale in-the-wild dataset for valence-arousal estimation from videos with a variety of head poses, illumination conditions and occlusions.

125 papers · 0 benchmarks · Images

Yelp2018

The Yelp2018 dataset is adopted from the 2018 edition of the Yelp Challenge, in which local businesses such as restaurants and bars are treated as items. The common 10-core setting is applied to ensure data quality, keeping only users and items with at least 10 interactions (see the preprocessing sketch below).

125 papers · 9 benchmarks
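The 10-core setting mentioned above means the interaction data is filtered iteratively until every remaining user and item has at least 10 interactions. A minimal sketch of that preprocessing follows; the pandas column names and toy data are assumptions for illustration, not the dataset's actual schema.

```python
import pandas as pd

def k_core_filter(df, k=10, user_col="user_id", item_col="item_id"):
    """Iteratively drop users/items with fewer than k interactions until stable."""
    while True:
        user_counts = df[user_col].value_counts()
        item_counts = df[item_col].value_counts()
        keep = (
            df[user_col].map(user_counts).ge(k)
            & df[item_col].map(item_counts).ge(k)
        )
        if keep.all():
            return df
        df = df[keep]

# Toy interaction log standing in for Yelp2018 (illustrative only).
interactions = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "item_id": ["a", "b", "a", "b", "a"],
})
print(k_core_filter(interactions, k=2))
```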

QuickDraw

QuickDraw (Quick, Draw!) is a collection of more than 50 million hand-drawn sketches across 345 categories, contributed by players of Google's online game Quick, Draw!.

125 papers · 0 benchmarks

MSRA-TD500 (MSRA Text Detection 500 Database)

The MSRA-TD500 dataset is a text detection dataset that contains 300 training images and 200 test images. Text regions are arbitrarily oriented and annotated at the sentence level. Unlike the other datasets, it contains both English and Chinese text.

124 papers · 4 benchmarks · Images, Texts

FER+ (Face Expression Recognition Plus dataset)

The FER+ dataset is an extension of the original FER dataset, where the images have been re-labelled into one of 8 emotion types: neutral, happiness, surprise, sadness, anger, disgust, fear, and contempt.

124 papers · 7 benchmarks · Images

NCLT (North Campus Long-Term Vision and LiDAR)

The NCLT dataset is a large-scale, long-term autonomy dataset for robotics research collected on the University of Michigan's North Campus. It consists of omnidirectional imagery, 3D lidar, planar lidar, GPS, and proprioceptive sensors for odometry, collected using a Segway robot. The dataset was collected to facilitate research on long-term autonomous operation in changing environments. It comprises 27 sessions spaced approximately biweekly over the course of 15 months. The sessions repeatedly explore the campus, both indoors and outdoors, on varying trajectories and at different times of the day across all four seasons. This allows the dataset to capture many challenging elements, including moving obstacles (e.g., pedestrians, bicyclists, and cars), changing lighting, varying viewpoints, seasonal and weather changes (e.g., falling leaves and snow), and long-term structural changes caused by construction projects.

124 papers · 0 benchmarks · Images, Interactive

Microsoft Academic Graph

The Microsoft Academic Graph is a heterogeneous graph containing scientific publication records, citation relationships between those publications, as well as authors, institutions, journals, conferences, and fields of study.

124 papers · 0 benchmarks · Texts

JFT-300M

JFT-300M is an internal Google dataset used for training image classification models. Images are labeled using an algorithm that combines a complex mixture of raw web signals, connections between web pages, and user feedback, resulting in over one billion labels for the 300M images (a single image can have multiple labels). Of the billion image labels, approximately 375M are selected via an algorithm that aims to maximize the label precision of the selected images.

123 papers · 1 benchmark · Images

CARER (Contextualized Affect Representations for Emotion Recognition)

CARER is an emotion dataset collected through noisy labels, annotated via distant supervision as in Go et al. (2009).

123 papers · 0 benchmarks · Texts

PubLayNet

PubLayNet is a dataset for document layout analysis, built by automatically matching the XML representations with the content of over 1 million PDF articles that are publicly available on PubMed Central. The size of the dataset is comparable to established computer vision datasets, containing over 360 thousand document images in which typical document layout elements are annotated.

123 papers · 0 benchmarks · Images
Page 28 of 1000