WordNet-feelings is an affective dataset that identifies 3,664 word senses as feelings and associates each of them with one of nine categories of feeling: Actions, Anger, Attention, Attraction, Hedonics, Other, Physiological, Social, and Wellbeing.
TurkQA consists of a selection of sentences from English Wikipedia articles, with questions and answers crowdsourced from workers on Amazon Mechanical Turk.
The Dialog-based Language Learning dataset is designed to measure how well models can learn as a student from a teacher's textual responses to the student's answers (as well as, potentially, from an external real-valued reward signal).
To collect WikiSuggest, the Google Suggest API is used to harvest natural language questions, which are then submitted to Google Search. Whenever Google Search returns a box with a short answer from Wikipedia, an example is created from the question, the answer, and the Wikipedia document. When the answer string is missing from the document, this often indicates a spurious question-answer pair, such as ('what time is half time in rugby', '80 minutes, 40 minutes'); question-answer pairs without the exact answer string are therefore pruned. Of fifty examples examined after filtering, 54% were well-formed question-answer pairs whose answers could be grounded in the document, 20% contained answers without textual evidence in the document (the answer string exists only in an irrelevant context), and 26% were incorrect QA pairs.
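The pruning step described above can be sketched as a simple string-containment filter. This is an illustrative reconstruction, not the authors' code; the field names are assumptions.

```python
def prune_pairs(examples):
    """Keep only (question, answer, document) triples whose answer
    string occurs verbatim in the document text."""
    return [ex for ex in examples if ex["answer"] in ex["document"]]

examples = [
    # Grounded pair: the answer string appears in the document.
    {"question": "what time is half time in rugby",
     "answer": "40 minutes",
     "document": "A rugby match lasts 80 minutes, with a half-time "
                 "break after the first 40 minutes."},
    # Spurious pair: the concatenated answer never appears verbatim.
    {"question": "what time is half time in rugby",
     "answer": "80 minutes, 40 minutes",
     "document": "A rugby match lasts 80 minutes."},
]

kept = prune_pairs(examples)
print(len(kept))  # → 1: the spurious second pair is pruned
```

Note that verbatim matching is a coarse heuristic, which is why the paper still finds 26% incorrect pairs after filtering.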
The AAVE/SAE Paired Dataset contains 2,019 intent-equivalent AAVE/SAE pairs. The AAVE (African-American Vernacular English) samples are drawn from Blodgett et al. (2016)'s TwitterAAE, with their corresponding SAE (Standard American English) samples annotated by Amazon MTurk workers.
The Advice-Seeking Questions (ASQ) dataset is a collection of personal narratives with advice-seeking questions, split into train, test, and heldout sets of 8,865, 2,500, and 10,000 instances respectively. The dataset is used to train and evaluate methods that infer the advice-seeking goal behind a personal narrative. This task is formulated as a cloze test, where the goal is to identify which of two advice-seeking questions was removed from a given narrative.
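The cloze formulation above can be sketched as follows. This is a hypothetical illustration of how such an evaluation instance might be constructed; the field names and example texts are assumptions, not taken from the ASQ release.

```python
import random

def make_cloze_instance(narrative, true_question, distractor, rng):
    """Pair a narrative with two candidate questions in random order;
    the label indexes the question that was actually removed from it."""
    candidates = [true_question, distractor]
    rng.shuffle(candidates)
    return {"narrative": narrative,
            "candidates": candidates,
            "label": candidates.index(true_question)}

rng = random.Random(0)
inst = make_cloze_instance(
    "My roommate keeps borrowing my things without asking...",
    "How do I set boundaries without ruining the friendship?",
    "Which laptop should I buy for college?",
    rng)

# A model must pick the candidate indexed by inst["label"].
print(inst["candidates"][inst["label"]])
```

Shuffling the candidates prevents a model from exploiting positional bias in the binary choice.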
The AGRR-2019 dataset consists of 7.5k sentences with gapping (as well as 15k relevant negative sentences) and comprises data from various genres: news, fiction, social media, and technical texts. It was prepared for the Automatic Gapping Resolution Shared Task for Russian (AGRR-2019), a competition aimed at stimulating the development of NLP tools and methods for processing ellipsis.
The Alexa Point of View dataset is a point-of-view conversion dataset: a parallel corpus of messages spoken to a virtual assistant (input column) and the converted messages for delivery (output column). Each input and POV-converted output pair is tab-separated; for example: tell @CN@ that i'll be late [\t] hi @CN@, @SCN@ would like you to know that they'll be late. The @CN@ tag is a placeholder for the contact name (receiver) and the @SCN@ tag is a placeholder for the source contact name (sender). The dataset contains 46,563 pairs in total, split into 6,985 test, 32,594 train, and 6,985 dev pairs.
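Reading one of the tab-separated pairs and filling the placeholder tags can be sketched as below. The function and contact names are illustrative assumptions, not part of the dataset release.

```python
def render(line, contact_name, source_contact_name):
    """Split one tab-separated line into (input, output) and substitute
    the @CN@ (receiver) and @SCN@ (sender) placeholders."""
    source, target = line.rstrip("\n").split("\t")
    def fill(s):
        return (s.replace("@CN@", contact_name)
                 .replace("@SCN@", source_contact_name))
    return fill(source), fill(target)

line = ("tell @CN@ that i'll be late\t"
        "hi @CN@, @SCN@ would like you to know that they'll be late")
src, tgt = render(line, "Alex", "Sam")
print(src)  # → tell Alex that i'll be late
print(tgt)  # → hi Alex, Sam would like you to know that they'll be late
```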
This dataset contains a large number of online videos and subtitles.
This dataset contains 2,360 paraphrases in Armenian that can be used for paraphrase detection. The dataset is constructed by back-translating sentences from Armenian to English twice, and manually filtering the result.
AskParents is a dataset for advice classification extracted from Reddit. In this dataset, posts are annotated for whether they contain advice or not. It contains 8,701 samples for training, 802 for validation and 1,091 for testing.
This dataset is used to evaluate a predictive consent model for users’ information shared in social media. In this task, the goal is to predict whether the users will give their consent to share that data with different hypothetical audiences within a medical context. The dataset is built from information the users posted on Facebook and their consent answers about each piece of information.
AuxAD is a distantly supervised dataset for acronym disambiguation.
A dataset for evaluating English-Chinese bilingual contextual word similarity. It consists of 2,091 English-Chinese word pairs with their sentential contexts and similarity scores annotated by human raters.
Bianet is a parallel news corpus in Turkish, Kurdish, and English. It contains 3,214 Turkish articles with their sentence-aligned Kurdish or English translations from the Bianet online newspaper.
The CECW dataset is a color-extended version of Cleanup World (CW), borrowed from the mobile-manipulation robot domain. CW is a simulation environment in which an agent acts on received instructions; it contains a movable object and four rooms in four colors: "blue," "green," "red," and "yellow." Commands in CW are parsed according to a particular Geometric Linear Temporal Logic (GLTL) grammar, yielding a total of 3,382 commands reflecting 39 GLTL expressions.
A large-scale Chinese legal dataset for judgment prediction. The dataset contains more than 2.6 million criminal cases published by the Supreme People's Court of China, several times larger than the datasets used in existing work on judgment prediction.
Chinese Literature NER RE is a Discourse-Level Named Entity Recognition and Relation Extraction Dataset for Chinese Literature Text. It is constructed from hundreds of Chinese literature articles.
A diverse dataset of written code-switched productions, curated from topical threads of multiple bilingual communities on the Reddit discussion platform. It enables the exploration of questions that, until now, have mainly been addressed in the context of spoken language.
The Composed Quora dataset consists of questions extracted from Quora that are grouped together if they ask the same thing. The dataset contains 60,400 groups of questions, each group containing at least 3 questions that ask the same thing.