Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

3,148 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3D meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • MIDI (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • CAD (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)

3,148 dataset results

ReactionGIF

ReactionGIF is an affective dataset of 30K tweets which can be used for tasks like induced sentiment prediction and multilabel classification of induced emotions.

2 papers · 0 benchmarks · Images, Texts

POINTREC

POINTREC is a test collection for point-of-interest (POI) recommendation, comprising (i) a set of information needs, (ii) a dataset of POIs, and (iii) graded relevance assessments for pairs of information needs and POIs.

2 papers · 0 benchmarks · Texts

Instantiation Dataset

Instantiation is a dataset for the task of instantiation detection.

2 papers · 0 benchmarks · Texts

The 'Call me sexist but' Dataset (CMSB)

Tweets and items from psychological scales for sexism detection with counterfactual examples.

2 papers · 0 benchmarks · Texts

Stanford Schema2QA Dataset

Schema2QA is the first large question answering dataset over real-world Schema.org data. It covers 6 common domains: restaurants, hotels, people, movies, books, and music, based on Schema.org metadata crawled from 6 different websites (Yelp, Hyatt, LinkedIn, IMDb, Goodreads, and last.fm). In total, there are over 2,000,000 training examples, consisting of both augmented human paraphrase data and high-quality synthetic data generated by Genie. All questions are annotated with programs in ThingTalk, an executable virtual assistant programming language.

2 papers · 0 benchmarks · Texts

5k_presetation_slides (5000 presentation slide pairs)

We crawled 5,000 paper–slide pairs from conference proceedings websites (e.g., acl.org and usenix.org).

2 papers · 0 benchmarks · Texts

CiteWorth

CiteWorth is a large, contextualized, rigorously cleaned, labelled dataset for cite-worthiness detection, built from a massive corpus of extracted plain-text scientific documents.

2 papers · 0 benchmarks · Texts

IndiaPoliceEvents

IndiaPoliceEvents is a corpus of 21,391 sentences from 1,257 English-language Times of India articles about events in the state of Gujarat during March 2002. This dataset is used for automated event extraction.

2 papers · 0 benchmarks · Texts

MultiOpEd

MultiOpEd is an open-domain corpus of multi-perspective news editorials that supports various tasks pertaining to the argumentation structure of news editorials, focusing on automatic perspective discovery. News editorials are a genre of persuasive text in which the argumentation structure is usually implicit. However, the arguments presented in an editorial typically center around a concise, focused thesis, referred to as its perspective. MultiOpEd aims to support the study of multiple tasks relevant to automatic perspective discovery, where a system is expected to produce a single-sentence thesis statement summarizing the arguments presented.

2 papers · 0 benchmarks · Texts

Python Programming Puzzles (P3)

Python Programming Puzzles (P3) is an open-source dataset in which each puzzle is defined by a short Python program, and the goal is to find an input that makes the program return True. The puzzles are objective in that each one is specified entirely by the source code of its verifier, so evaluating the verifier is all that is needed to test a candidate solution. They require no answer key or input/output examples, and they do not depend on natural language understanding.

2 papers · 0 benchmarks · Texts
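The verifier-based format can be sketched as follows; `sat` is a hypothetical puzzle written in the style of P3, not one drawn from the dataset itself:

```python
# A hypothetical puzzle in the style of P3: the puzzle is a short Python
# verifier, and a solution is any input that makes it return True.
def sat(s: str) -> bool:
    """Find a 5-character string that reads the same forwards and backwards."""
    return len(s) == 5 and s == s[::-1]

# Testing a candidate only requires evaluating the verifier --
# no answer key or input/output examples are needed.
print(sat("level"))   # a palindrome of length 5 satisfies the puzzle
print(sat("hello"))   # a non-palindrome does not
```

Because correctness is decided purely by running the verifier, candidate solutions from humans or language models can be checked automatically.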

DisKnE (Disease Knowledge Evaluation)

DisKnE is a benchmark for Disease Knowledge Evaluation built from MedNLI and MEDIQA-NLI. This benchmark is constructed to specifically test the medical reasoning capabilities of ML models, such as mapping symptoms to diseases.

2 papers · 0 benchmarks · Medical, Texts

OpenSLR (Open Speech and Language Resources)

OpenSLR is a repository of open speech and language resources, including large-scale transcribed audio corpora and related software. It serves as a central platform for researchers and practitioners to access and share datasets used in speech recognition (ASR), text-to-speech (TTS), and linguistic research.

2 papers · 0 benchmarks · Audio, Texts

Morph Call

Morph Call is a suite of 46 probing tasks for four Indo-European languages with differing morphology: Russian, French, English, and German. The tasks are designed to explore the morphosyntactic content of multilingual transformers, which remains a less studied aspect.

2 papers · 0 benchmarks · Texts

Amazon-PQA

Amazon-PQA is a product question-answer dataset. It includes questions and their answers as published on the Amazon website, along with the public product information and category (Amazon Browse Node name). It contains more than 8M questions covering over 1M products.

2 papers · 0 benchmarks · Texts

Common Crawl

The Common Crawl corpus contains petabytes of data collected over 12 years of web crawling. The corpus contains raw web page data, metadata extracts and text extracts. Common Crawl data is stored on Amazon Web Services’ Public Data Sets and on multiple academic cloud platforms across the world.

2 papers · 0 benchmarks · Texts

SportSett

SportSett is a resource designed to enable research into Natural Language Generation, in particular neural data-to-text approaches, although it is not limited to these.

2 papers · 0 benchmarks · Tabular, Texts

Wikidata-14M

Wikidata-14M is a recommender system dataset for recommending items to Wikidata editors. It consists of 220,000 editors responsible for 14 million interactions with 4 million items.

2 papers · 0 benchmarks · Texts

DUC 2007 (Document Understanding Conferences)

There is currently much interest and activity aimed at building powerful multi-purpose information systems. The agencies involved include DARPA, ARDA and NIST. Their programmes, for example DARPA's TIDES (Translingual Information Detection Extraction and Summarization) programme, ARDA's Advanced Question & Answering Program and NIST's TREC (Text Retrieval Conferences) programme cover a range of subprogrammes. These focus on different tasks requiring their own evaluation designs.

2 papers · 0 benchmarks · Texts

VR traffic traces

The dataset contains traffic traces collected from 3 different VR applications. Researchers can use it to replicate the behavior of real VR traffic directly in their studies, e.g., in simulations. Further information can be found in the repository.

2 papers · 0 benchmarks · Texts, Time series

MobIE

MobIE is a German-language dataset which is human-annotated with 20 coarse- and fine-grained entity types and entity linking information for geographically linkable entities. The dataset consists of 3,232 social media texts and traffic reports with 91K tokens, and contains 20.5K annotated entities, 13.1K of which are linked to a knowledge base. A subset of the dataset is human-annotated with seven mobility-related, n-ary relation types, while the remaining documents are annotated using a weakly-supervised labeling approach implemented with the Snorkel framework.

2 papers · 0 benchmarks · Texts
Page 90 of 158