TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images3,275
  • Texts3,148
  • Videos1,019
  • Audio486
  • Medical395
  • 3D383
  • Time series298
  • Graphs285
  • Tabular271
  • Speech199
  • RGB-D192
  • Environment148
  • Point cloud135
  • Biomedical123
  • LiDAR95
  • RGB Video87
  • Tracking78
  • Biology71
  • Actions68
  • 3d meshes65
  • Tables52
  • Music48
  • EEG45
  • Hyperspectral images45
  • Stereo44
  • MRI39
  • Physics32
  • Interactive29
  • Dialog25
  • Midi22
  • 6D17
  • Replay data11
  • Financial10
  • Ranking10
  • Cad9
  • fMRI7
  • Parallel6
  • Lyrics2
  • PSG2

19,997 dataset results

SoccerDB

Comprises of 171,191 video segments from 346 high-quality soccer games. The database contains 702,096 bounding boxes, 37,709 essential event labels with time boundary and 17,115 highlight annotations for object detection, action recognition, temporal action localization, and highlight detection tasks.

5 papers0 benchmarks

SP-10K

A large-scale evaluation set that provides human ratings for the plausibility of 10,000 SP pairs over five SP relations, covering 2,500 most frequent verbs, nouns, and adjectives in American English.

5 papers0 benchmarksTexts

SYNTHIA-AL

Specially designed to evaluate active learning for video object detection in road scenes.

5 papers0 benchmarksVideos

Taskmaster-2

The Taskmaster-2 dataset consists of 17,289 dialogs in seven domains: restaurants (3276), food ordering (1050), movies (3047), hotels (2355), flights (2481), music (1602), and sports (3478).

5 papers0 benchmarksDialog, Texts

Tasty Videos

A collection of 2511 recipes for zero-shot learning, recognition and anticipation.

5 papers0 benchmarksTexts, Videos

Tencent ML-Images

Tencent ML-Images is a large open-source multi-label image database, including 17,609,752 training and 88,739 validation image URLs, which are annotated with up to 11,166 categories.

5 papers0 benchmarksImages

Twitter100k

Twitter100k is a large-scale dataset for weakly supervised cross-media retrieval.

5 papers0 benchmarksImages, Texts

UIT-ViNewsQA

UIT-ViNewsQA is a new corpus for the Vietnamese language to evaluate healthcare reading comprehension models. The corpus comprises 22,057 human-generated question-answer pairs. Crowd-workers create the questions and their answers based on a collection of over 4,416 online Vietnamese healthcare news articles, where the answers comprise spans extracted from the corresponding articles.

5 papers0 benchmarks

VoxClamantis

A large-scale corpus for phonetic typology, with aligned segments and estimated phoneme-level labels in 690 readings spanning 635 languages, along with acoustic-phonetic measures of vowels and sibilants.

5 papers0 benchmarksSpeech

WGISD (Embrapa Wine Grape Instance Segmentation Dataset)

Embrapa Wine Grape Instance Segmentation Dataset (WGISD) contains grape clusters properly annotated in 300 images and a novel annotation methodology for segmentation of complex objects in natural images.

5 papers0 benchmarksImages

WWW Crowd

WWW Crowd provides 10,000 videos with over 8 million frames from 8,257 diverse scenes, therefore offering a comprehensive dataset for the area of crowd understanding.

5 papers0 benchmarksImages, Videos

Sarcasm Corpus V2

The Sarcasm Corpus contains sarcastic and non-sarcastic utterances of three different types, which are balanced with half of the samples being sarcastic and half non-sarcastic. The three types are:

5 papers0 benchmarksTexts

SpatialVOC2K

A multilingual image dataset with spatial relation annotations and object features for image-to-text generation, built using 2,026 images from the PASCAL VOC2008 dataset.

5 papers0 benchmarks

PKU-Reid

This dataset contains 114 individuals including 1824 images captured from two disjoint camera views. For each person, eight images are captured from eight different orientations under one camera view and are normalized to 128x48 pixels. This dataset is also split into two parts randomly. One contains 57 individuals for training, and the other contains 57 individuals for testing.

5 papers1 benchmarksImages

BIRD (Blocksworld Image Reasoning Dataset)

Blocksworld Image Reasoning Dataset (BIRD) contains images of wooden blocks in different configurations, and the sequence of moves to rearrange one configuration to the other.

5 papers0 benchmarksImages

A Large Dataset of Object Scans

A Large Dataset of Object Scans is a dataset of more than ten thousand 3D scans of real objects. To create the dataset, the authors recruited 70 operators, equipped them with consumer-grade mobile 3D scanning setups, and paid them to scan objects in their environments. The operators scanned objects of their choosing, outside the laboratory and without direct supervision by computer vision professionals. The result is a large and diverse collection of object scans: from shoes, mugs, and toys to grand pianos, construction vehicles, and large outdoor sculptures. The authors worked with an attorney to ensure that data acquisition did not violate privacy constraints. The acquired data was placed in the public domain and is available freely.

5 papers0 benchmarks3D, RGB-D

MSAW (Multi-Sensor All Weather Mapping)

Multi-Sensor All Weather Mapping (MSAW) is a dataset and challenge, which features two collection modalities (both SAR and optical). The dataset and challenge focus on mapping and building footprint extraction using a combination of these data sources. MSAW covers 120 km^2 over multiple overlapping collects and is annotated with over 48,000 unique building footprints labels, enabling the creation and evaluation of mapping algorithms for multi-modal data.

5 papers2 benchmarksImages

StreetStyle

StreetStyle is a large-scale dataset of photos of people annotated with clothing attributes, and use this dataset to train attribute classifiers via deep learning.

5 papers0 benchmarksImages

ELAS

ELAS is a dataset for lane detection. It contains more than 20 different scenes (in more than 15,000 frames) and considers a variety of scenarios (urban road, highways, traffic, shadows, etc.). The dataset was manually annotated for several events that are of interest for the research community (i.e., lane estimation, change, and centering; road markings; intersections; LMTs; crosswalks and adjacent lanes).

5 papers0 benchmarksImages

ObjectsRoom

The ObjectsRoom dataset is based on the MuJoCo environment used by the Generative Query Network 4 and is a multi-object extension of the 3d-shapes dataset. The training set contains 1M scenes with up to three objects. We also provide ~1K test examples for the following variants:

5 papers3 benchmarksImages
PreviousPage 213 of 1000Next