Datasets

3,148 machine learning datasets

3,148 dataset results

Chinese Gigaword

Chinese Gigaword corpus consists of 2.2M of headline-document pairs of news stories covering over 284 months from two Chinese newspapers, namely the Xinhua News Agency of China (XIN) and the Central News Agency of Taiwan (CNA).

2 papers0 benchmarksTexts

MUSE

The MUSE dataset contains bilingual dictionaries for 110 pairs of languages. For each language pair, the training seed dictionaries contain approximately 5000 word pairs while the evaluation sets contain 1500 word pairs.

2 papers0 benchmarksTexts

Rendered SST2

The Rendered SST2 dataset is a dataset released by OpenAI, that measures the optical character recognition capability of visual representations. It uses sentences from the Stanford Sentiment Treebank dataset and renders them into images, with black texts on a white background, in a 448×448 resolution.

2 papers1 benchmarksImages, Texts

WT-WT (Will-They-Won't-They)

Will-They-Won't-They (WT-WT) is a large dataset of English tweets targeted at stance detection for the rumor verification task. The dataset is constructed based on tweets that discuss five recent merger and acquisition (M&A) operations of US companies, mainly from the healthcare sector.

2 papers0 benchmarksTexts

MUStARD (Multimodal Sarcasm Detection Dataset)

We release the MUStARD dataset which is a multimodal video corpus for research in automated sarcasm discovery. The dataset is compiled from popular TV shows including Friends, The Golden Girls, The Big Bang Theory, and Sarcasmaholics Anonymous. MUStARD consists of audiovisual utterances annotated with sarcasm labels. Each utterance is accompanied by its context, which provides additional information on the scenario where the utterance occurs.

2 papers0 benchmarksImages, Texts

VIST-Edit

The dataset, VIST-Edit, includes 14,905 human-edited versions of 2,981 machine-generated visual stories. The stories were generated by two state-of-the-art visual storytelling models, each aligned to 5 human-edited versions.

2 papers0 benchmarksImages, Texts

X-WikiRE

X-WikiRE is a new, large-scale multilingual relation extraction dataset in which relation extraction is framed as a problem of reading comprehension to allow for generalization to unseen relations.

2 papers0 benchmarksTexts

Dataset of Legal Documents

Dataset of Legal Documents consists of court decisions from 2017 and 2018 were selected for the dataset, published online by the Federal Ministry of Justice and Consumer Protection. The documents originate from seven federal courts: Federal Labour Court (BAG), Federal Fiscal Court (BFH), Federal Court of Justice (BGH), Federal Patent Court (BPatG), Federal Social Court (BSG), Federal Constitutional Court (BVerfG) and Federal Administrative Court (BVerwG).

2 papers0 benchmarksTexts

UCC (Unhealthy Comments Corpus)

The Unhealthy Comments Corpus (UCC) is corpus of 44355 comments intended to assist in research on identifying subtle attributes which contribute to unhealthy conversations online.

2 papers0 benchmarksTexts

NQuAD (Nuclear Question Answering Dataset)

NQuAD is a Nuclear Question Answering Dataset, which contains 700+ nuclear Question Answer pairs developed and verified by expert nuclear researchers.

2 papers0 benchmarksTexts

NYT-H

NYT-H is a dataset for distantly-supervised relation extraction, in which DS-labelled training data is used and several annotators to label test data are hired. NYT-H can serve as a benchmark of distantly-supervised relation extraction.

2 papers0 benchmarksTexts

Circa

The Circa (meaning ‘approximately’) dataset aims to help machine learning systems to solve the problem of interpreting indirect answers to polar questions.

2 papers0 benchmarksTexts

MDD (Movie Dialog dataset)

Movie Dialog dataset (MDD) is designed to measure how well models can perform at goal and non-goal orientated dialog centered around the topic of movies (question answering, recommendation and discussion).

2 papers0 benchmarksTexts

30MQA (30M Factoid Question-Answer Corpus)

An enormous question answer pair corpus produced by applying a novel neural network architecture on the knowledge base Freebase to transduce facts into natural language questions.

2 papers0 benchmarksTexts

APE (Automatic Post-Editing)

APE is useful to evaluate Machine Translation automatic post-editing (APE), which is the task of improving the output of a blackbox MT system by automatically fixing its mistakes. The act of post-editing text can be fully specified as a sequence of delete and insert actions in given positions.

2 papers0 benchmarksTexts

CIC (Catalonia Independence Corpus)

The dataset is annotated with stance towards one topic, namely, the independence of Catalonia.

2 papers0 benchmarksTexts

Climate Claims

The Climate Change Claims dataset for generating fact checking summaries contains claims broadly related to climate change and global warming from climatefeedback.org. It contains 1k documents from 104 different claims from 97 different domains.

2 papers0 benchmarksTexts

Covid-HeRA

Covid-HeRA is a dataset for health risk assessment and severity-informed decision making in the presence of COVID19 misinformation. It is a benchmark dataset for risk-aware health misinformation detection, related to the 2019 coronavirus pandemic. Social media posts (Twitter) are annotated based on the perceived likelihood of health behavioural changes and the perceived corresponding risks from following unreliable advice found online.

2 papers0 benchmarksTexts

CQR (Contextual Query Rewrite)

CQR is an extension to the Stanford Dialogue Corpus. It contains crowd-sourced rewrites to facilitate research in dialogue state tracking using natural language as the interface.

2 papers0 benchmarksTexts

Curiosity

The Curiosity dataset consists of 14K dialogs (with 181K utterances) with fine-grained knowledge groundings, dialog act annotations, and other auxiliary annotation. In this dataset users and virtual assistants converse about geographic topics like geopolitical entities and locations. This dataset is annotated with pre-existing user knowledge, message-level dialog acts, grounding to Wikipedia, and user reactions to messages.

2 papers0 benchmarksTexts

PreviousPage 85 of 158Next