TasksSotADatasetsPapersMethodsSubmitAbout
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Explore

Notable BenchmarksAll SotADatasetsPapersMethods

Community

Submit ResultsAbout

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

19,997 machine learning datasets

Filter by Modality

  • Images3,275
  • Texts3,148
  • Videos1,019
  • Audio486
  • Medical395
  • 3D383
  • Time series298
  • Graphs285
  • Tabular271
  • Speech199
  • RGB-D192
  • Environment148
  • Point cloud135
  • Biomedical123
  • LiDAR95
  • RGB Video87
  • Tracking78
  • Biology71
  • Actions68
  • 3d meshes65
  • Tables52
  • Music48
  • EEG45
  • Hyperspectral images45
  • Stereo44
  • MRI39
  • Physics32
  • Interactive29
  • Dialog25
  • Midi22
  • 6D17
  • Replay data11
  • Financial10
  • Ranking10
  • Cad9
  • fMRI7
  • Parallel6
  • Lyrics2
  • PSG2

19,997 dataset results

JNLPBA

JNLPBA is a biomedical dataset that comes from the GENIA version 3.02 corpus (Kim et al., 2003). It was created with a controlled search on MEDLINE. From this search 2,000 abstracts were selected and hand annotated according to a small taxonomy of 48 classes based on a chemical classification. 36 terminal classes were used to annotate the GENIA corpus.

20 papers2 benchmarksTexts

YAGO3-10 (Yet Another Great Ontology 3-10)

YAGO3-10 is benchmark dataset for knowledge base completion. It is a subset of YAGO3 (which itself is an extension of YAGO) that contains entities associated with at least ten different relations. In total, YAGO3-10 has 123,182 entities and 37 relations and 1,179,040 triples, and most of the triples describe attributes of persons such as citizenship, gender, and profession.

20 papers5 benchmarks

AQA-7

Consists of 1106 action samples from seven actions with quality scores as measured by expert human judges.

20 papers2 benchmarksAudio, Videos

Tweebank

Briefly describe the dataset. Provide:

20 papers3 benchmarks

iKala

The iKala dataset is a singing voice separation dataset that comprises of 252 30-second excerpts sampled from 206 iKala songs (plus 100 hidden excerpts reserved for MIREX data mining contest). The music accompaniment and the singing voice are recorded at the left and right channels respectively. Additionally, the human-labeled pitch contours and timestamped lyrics are provided.

20 papers1 benchmarksAudio, Lyrics

George Washington

The George Washington dataset contains 20 pages of letters written by George Washington and his associates in 1755 and thereby categorized into historical collection. The images are annotated at word level and contain approximately 5,000 words.

20 papers0 benchmarksImages, Texts

RoboCup

RoboCup is an initiative in which research groups compete by enabling their robots to play football matches. Playing football requires solving several challenging tasks, such as vision, motion, and team coordination. Framing the research efforts onto football attracts public interest (and potential research funding) in robotics, which may otherwise be less entertaining to non-experts.

20 papers0 benchmarksTexts

CLOTH (CLOze test by TeacHers)

The Cloze Test by Teachers (CLOTH) benchmark is a collection of nearly 100,000 4-way multiple-choice cloze-style questions from middle- and high school-level English language exams, where the answer fills a blank in a given text. Each question is labeled with a type of deep reasoning it involves, where the four possible types are grammar, short-term reasoning, matching/paraphrasing, and long-term reasoning, i.e., reasoning over multiple sentences

20 papers0 benchmarksTexts

PubMed RCT (PubMed 200k RCT)

PubMed 200k RCT is new dataset based on PubMed for sequential sentence classification. The dataset consists of approximately 200,000 abstracts of randomized controlled trials, totaling 2.3 million sentences. Each sentence of each abstract is labeled with their role in the abstract using one of the following classes: background, objective, method, result, or conclusion. The purpose of releasing this dataset is twofold. First, the majority of datasets for sequential short-text classification (i.e., classification of short texts that appear in sequences) are small: the authors hope that releasing a new large dataset will help develop more accurate algorithms for this task. Second, from an application perspective, researchers need better tools to efficiently skim through the literature. Automatically classifying each sentence in an abstract would help researchers read abstracts more efficiently, especially in fields where abstracts may be long, such as the medical field.

20 papers0 benchmarksTexts

PASCAL VOC 2011

PASCAL VOC 2011 is an image segmentation dataset. It contains around 2,223 images for training, consisting of 5,034 objects. Testing consists of 1,111 images with 2,028 objects. In total there are over 5,000 precisely segmented objects for training.

20 papers2 benchmarksImages

QuaRel

QuaRel is a crowdsourced dataset of 2771 multiple-choice story questions, including their logical forms.

20 papers0 benchmarksTexts

EORSSD (Extended Optical Remote Sensing Saliency Detection)

The Extended Optical Remote Sensing Saliency Detection (EORSSD) dataset is an extension of the ORSSD dataset. This new dataset is larger and more varied than the original. It contains 2,000 images and corresponding pixel-wise ground truth, which includes many semantically meaningful but challenging images.

20 papers0 benchmarksImages

ETHOS (multi-labEl haTe speecH detectiOn dataSet)

ETHOS is a hate speech detection dataset. It is built from YouTube and Reddit comments validated through a crowdsourcing platform. It has two subsets, one for binary classification and the other for multi-label classification. The former contains 998 comments, while the latter contains fine-grained hate-speech annotations for 433 comments.

20 papers0 benchmarksTexts

FGADR

This dataset has 1,842 images with pixel-level DR-related lesion annotations, and 1,000 images with image-level labels graded by six board-certified ophthalmologists with intra-rater consistency. The proposed dataset will enable extensive studies on DR diagnosis.

20 papers0 benchmarksImages, Medical

MLQE-PE (Multilingual Quality Estimation and Automatic Post-editing Dataset)

The Multilingual Quality Estimation and Automatic Post-editing (MLQE-PE) Dataset is a dataset for Machine Translation (MT) Quality Estimation (QE) and Automatic Post-Editing (APE). The dataset contains seven language pairs, with human labels for 9,000 translations per language pair in the following formats: sentence-level direct assessments and post-editing effort, and word-level good/bad labels. It also contains the post-edited sentences, as well as titles of the articles where the sentences were extracted from, and the neural MT models used to translate the text.

20 papers0 benchmarksTexts

Obstacle Tower

Obstacle Tower is a high fidelity, 3D, 3rd person, procedurally generated environment for reinforcement learning. An agent playing Obstacle Tower must learn to solve both low-level control and high-level planning problems in tandem while learning from pixels and a sparse reward signal. Unlike other benchmarks such as the Arcade Learning Environment, evaluation of agent performance in Obstacle Tower is based on an agent’s ability to perform well on unseen instances of the environment.

20 papers0 benchmarks3D, Environment

PlantDoc

PlantDoc is a dataset for visual plant disease detection. The dataset contains 2,598 data points in total across 13 plant species and up to 17 classes of diseases, involving approximately 300 human hours of effort in annotating internet scraped images.

20 papers2 benchmarks

PolitiFact

Fact-checking (FC) articles which contains pairs (multimodal tweet and a FC-article) from politifact.com.

20 papers1 benchmarks

PreCo

A large-scale English dataset for coreference resolution. The dataset is designed to embody the core challenges in coreference, such as entity representation, by alleviating the challenge of low overlap between training and test sets and enabling separated analysis of mention detection and mention clustering.

20 papers1 benchmarksTexts

RCTW-17 (Reading Chinese Text in the Wild)

Features a large-scale dataset with 12,263 annotated images. Two tasks, namely text localization and end-to-end recognition, are set up. The competition took place from January 20 to May 31, 2017. 23 valid submissions were received from 19 teams.

20 papers0 benchmarksTexts
PreviousPage 102 of 1000Next