Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

19,997 dataset results

YouTube-100M

The YouTube-100M dataset consists of 100 million YouTube videos: 70M training videos, 10M evaluation videos, and 20M validation videos. Videos average 4.6 minutes each, for a total of 5.4M training hours. Each video is labeled with one or more topic identifiers from a set of 30,871 labels, with around 5 labels per video on average. The labels are assigned automatically based on a combination of metadata (title, description, comments, etc.), context, and image content for each video. They apply to the entire video and range from very generic (e.g. “Song”) to very specific (e.g. “Cormorant”). Being machine-generated, the labels are not 100% accurate, and of the 30K labels some are clearly acoustically relevant (“Trumpet”) while others are less so (“Web Page”). Videos often bear annotations at multiple degrees of specificity: for example, videos labeled “Trumpet” are often also labeled “Entertainment”, although no hierarchy is enforced.

8 papers · 0 benchmarks · Audio, Videos
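The split sizes and the quoted 5.4M training hours can be checked with a quick back-of-the-envelope calculation (a sketch using only the figures from the description above):

```python
# Sanity-check the YouTube-100M figures quoted above:
# 70M training + 10M evaluation + 20M validation videos,
# averaging 4.6 minutes each.
train, evaluation, validation = 70_000_000, 10_000_000, 20_000_000
assert train + evaluation + validation == 100_000_000  # 100M total

avg_minutes = 4.6
training_hours = train * avg_minutes / 60  # minutes -> hours
print(f"{training_hours / 1e6:.2f}M training hours")  # ~5.37M, rounded to 5.4M above
```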

TUT Sound Events 2017

The TUT Sound Events 2017 dataset contains 24 audio recordings of street environments, annotated with 6 classes: brakes squeaking, car, children, large vehicle, people speaking, and people walking.

8 papers · 0 benchmarks · Audio

Google

8 papers · 0 benchmarks

WikiTableT

WikiTableT pairs Wikipedia article sections with their corresponding tabular data and various metadata. It contains millions of instances covering a broad range of topics and a variety of generation tasks.

8 papers · 0 benchmarks · Texts

AIDER

A dataset for automated aerial scene classification of disaster events from on-board UAV imagery.

8 papers · 1 benchmark

AnimalWeb

A large-scale, hierarchically annotated dataset of animal faces, featuring 21.9K faces from 334 diverse species and 21 animal orders across biological taxonomy. The faces were captured under “in-the-wild” conditions and are consistently annotated with 9 landmarks on key facial features. The dataset is structured and scalable by design; its development underwent four systematic stages involving over 6K person-hours of rigorous manual annotation.

8 papers · 0 benchmarks

APRICOT

APRICOT is a collection of over 1,000 annotated photographs of printed adversarial patches in public locations. The patches target several object categories for three COCO-trained detection models, and the photos represent natural variation in position, distance, lighting conditions, and viewing angle.

8 papers · 0 benchmarks · Images

ArSenTD-LEV

The Arabic Sentiment Twitter Dataset for the Levantine dialect (ArSenTD-LEV) is a dataset of 4,000 tweets with the following annotations: the overall sentiment of the tweet, the target to which the sentiment was expressed, how the sentiment was expressed, and the topic of the tweet.

8 papers · 0 benchmarks · Texts

ATRW (Amur Tiger Re-identification in the Wild)

The ATRW Dataset contains over 8,000 video clips from 92 Amur tigers, with bounding box, pose keypoint, and tiger identity annotations.

8 papers · 0 benchmarks · Images

Cityscapes Panoptic Parts

The Cityscapes Panoptic Parts dataset introduces part-aware panoptic segmentation annotations for Cityscapes, extending the original panoptic annotations with part-level annotations for selected scene-level classes.

8 papers · 1 benchmark · Images

CMRC 2019 (Chinese Machine Reading Comprehension 2019)

CMRC 2019 is a Chinese Machine Reading Comprehension dataset used in the Third Evaluation Workshop on Chinese Machine Reading Comprehension. Specifically, it is a sentence cloze-style machine reading comprehension dataset that aims to evaluate sentence-level inference ability.

8 papers · 0 benchmarks · Texts

CoNLL-2000

CoNLL-2000 is a dataset for dividing text into syntactically related, non-overlapping groups of words, a task known as text chunking.

8 papers · 0 benchmarks · Texts

Cops-Ref

Cops-Ref is a dataset for visual reasoning in context of referring expression comprehension with two main features.

8 papers · 0 benchmarks · Images

COVIDGR

The COVIDGR-1.0 dataset of anonymized patient X-ray images was built in close collaboration with an expert radiologist team at the Hospital Universitario San Cecilio. It contains 852 images collected under a strict labeling protocol, categorized into 426 positive and 426 negative cases. Positive images correspond to patients who tested positive for COVID-19 by RT-PCR within at most 24 hours of the X-ray. Every image was taken with the same type of equipment and in the same format; only the posterior-anterior view is considered.

8 papers · 3 benchmarks · Images, Medical

CSD (Collaborative SLAM Dataset)

The dataset comprises 4 different subsets (Flat, House, Priory, and Lab), each containing a number of different sequences that can be successfully relocalised against each other.

8 papers · 2 benchmarks

CUHK-Shadow

CUHK-Shadow collects shadow images from multiple scenarios into a dataset of 10,500 images, each with a labeled ground-truth mask, to support shadow detection in complex real-world scenes. The dataset covers a rich variety of scene categories, with diverse shadow sizes, locations, contrasts, and types.

8 papers · 1 benchmark · Images

Diabetic Retinopathy Detection Dataset

A large-scale retina image dataset.

8 papers · 0 benchmarks

DOTmark (Discrete Optimal Transport Benchmark)

DOTmark is a benchmark for discrete optimal transport, which is designed to serve as a neutral collection of problems, where discrete optimal transport methods can be tested, compared to one another, and brought to their limits on large-scale instances. It consists of a variety of grayscale images, in various resolutions and classes, such as several types of randomly generated images, classical test images and real data from microscopy.

8 papers · 0 benchmarks · Images

ERA (Event Recognition in Aerial videos)

ERA consists of 2,864 videos, each labeled with one of 25 classes corresponding to an event unfolding over 5 seconds. The dataset is designed to have significant intra-class variation and inter-class similarity, capturing dynamic events in various circumstances and at dramatically different scales.

8 papers · 0 benchmarks

FVI (Free-form Video Inpainting)

The Free-Form Video Inpainting dataset is used for training and evaluating video inpainting models. It consists of 1,940 videos from the YouTube-VOS dataset and 12,600 videos from the YouTube-BoundingBoxes dataset.

8 papers · 0 benchmarks · Videos
Page 169 of 1000