Datasets

19,997 machine learning datasets

NoReC (Norwegian Review Corpus)

The Norwegian Review Corpus (NoReC) was created for the purpose of training and evaluating models for document-level sentiment analysis. More than 43,000 full-text reviews have been collected from major Norwegian news sources and cover a range of different domains, including literature, movies, video games, restaurants, music and theater, in addition to product reviews across a range of categories. Each review is labeled with a manually assigned score of 1–6, as provided by the rating of the original author.

17 papers | 0 benchmarks | Texts

ORConvQA (Open-Retrieval Conversational Question Answering)

ORConvQA enhances QuAC by adapting it to an open-retrieval setting. It is an aggregation of three existing datasets: (1) the QuAC dataset, which offers information-seeking conversations, (2) the CANARD dataset, which consists of context-independent rewrites of QuAC questions, and (3) the Wikipedia corpus, which serves as the knowledge source for answering questions.

17 papers | 0 benchmarks

ParaBank

A large-scale English paraphrase dataset that surpasses prior work in both quantity and quality.

17 papers | 0 benchmarks

RPC (Retail Product Checkout)

RPC is a large-scale retail product checkout dataset covering 200 retail SKUs. The SKUs are divided into 17 meta-categories: puffed food, dried fruit, dried food, instant drink, instant noodles, dessert, drink, alcohol, milk, canned food, chocolate, gum, candy, seasoner, personal hygiene, tissue, and stationery.

17 papers | 0 benchmarks | Images

VIPL-HR

VIPL-HR is a database for remote heart rate (HR) estimation from face videos in less-constrained situations. It contains 2,378 visible light (VIS) videos and 752 near-infrared (NIR) videos of 107 subjects. Nine different conditions, including various head movements and illumination conditions, are taken into consideration. All videos are recorded using a Logitech C310, a RealSense F200, and the front camera of a HUAWEI P9 smartphone, and the ground-truth HR is recorded using a CONTEC CMS60C BVP sensor (an FDA-approved device).

17 papers | 10 benchmarks | Videos

WikiSection

A publicly available dataset with 242k labeled sections in English and German from two distinct domains: diseases and cities.

17 papers | 0 benchmarks

Switchboard-1 Corpus

The Switchboard-1 Telephone Speech Corpus (LDC97S62) consists of approximately 260 hours of speech and was originally collected by Texas Instruments in 1990-1, under DARPA sponsorship. The first release of the corpus was published by NIST and distributed by the LDC in 1992-3.

17 papers | 0 benchmarks | Speech

QAMR (Question-Answer Meaning Representation Dataset)

Question-Answer Meaning Representation (QAMR) represents a predicate-argument structure of a sentence with a set of question-answer pairs, so that annotations can be easily provided by non-experts. QAMR is a dataset of over 5,000 sentences and 100,000 questions created by crowdsourcing workers.

17 papers | 0 benchmarks | Texts

MoNuSeg

The dataset for this challenge was obtained by carefully annotating tissue images of several patients with tumors of different organs who were diagnosed at multiple hospitals. It was created by downloading H&E-stained tissue images captured at 40x magnification from the TCGA archive. H&E staining is a routine protocol to enhance the contrast of a tissue section and is commonly used for tumor assessment (grading, staging, etc.). Given the diversity of nuclei appearances across multiple organs and patients, and the richness of staining protocols adopted at multiple hospitals, the training dataset is intended to enable the development of robust and generalizable nuclei segmentation techniques that work out of the box.

17 papers | 7 benchmarks | Images, Medical

VQA-E

VQA-E is a dataset for Visual Question Answering with Explanation, where models are required to generate an explanation along with the predicted answer. The VQA-E dataset is automatically derived from the VQA v2 dataset by synthesizing a textual explanation for each image-question-answer triple.

17 papers | 0 benchmarks | Images, Texts

MalNet

MalNet is a large public graph database, representing a large-scale ontology of software function call graphs. MalNet contains over 1.2 million graphs, averaging over 17k nodes and 39k edges per graph, across a hierarchy of 47 types and 696 families.

17 papers | 1 benchmark | Graphs

Shiny dataset

The shiny folder contains 8 scenes with challenging view-dependent effects used in our paper. We also provide additional scenes in the shiny_extended folder. The test images for each scene used in our paper consist of one of every eight images in alphabetical order.
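The test split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_every_eighth`, the file names, and the choice to start at the first image (offset 0) are all assumptions, since the description only says that one of every eight images in alphabetical order is held out.

```python
def split_every_eighth(image_names, step=8, offset=0):
    """Split image names into train/test by holding out every `step`-th image.

    Names are sorted alphabetically first. `offset` selects which image of each
    group of `step` is held out; offset=0 is an assumption, not from the source.
    """
    ordered = sorted(image_names)
    test = [name for i, name in enumerate(ordered) if i % step == offset]
    train = [name for i, name in enumerate(ordered) if i % step != offset]
    return train, test


# Hypothetical file names, for illustration only.
names = [f"img_{i:03d}.png" for i in range(20)]
train, test = split_every_eighth(names)
print(test)
```

With 20 hypothetical images, this holds out three of them for testing and keeps the remaining 17 for training.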

17 papers | 0 benchmarks | Images

PartialSpoof

PartialSpoof is a dataset of partially-spoofed data to evaluate detection of partially-spoofed speech data. It has been built based on the ASVspoof 2019 LA database since the latter covers 17 types of spoofed data produced by advanced speech synthesizers, voice converters, and hybrids. The authors used the same set of bona fide data from the ASVspoof 2019 LA database but created partially spoofed audio from the ASVspoof 2019 LA data.

17 papers | 0 benchmarks | Speech

Quasimodo

Quasimodo is a commonsense knowledge base that focuses on salient properties of objects. Several subsets are provided.

17 papers | 0 benchmarks | Texts

OMG-Emotion (One-Minute Gradual-Emotional Behavior)

The One-Minute Gradual-Emotional Behavior dataset (OMG-Emotion) is composed of YouTube videos that are around a minute in length and are annotated for continuous emotional behavior. The videos were selected using a crawler that searches for specific keywords associated with long-term emotional behaviors, such as "monologues", "auditions", "dialogues" and "emotional scenes".

17 papers | 0 benchmarks | Videos

York Urban Line Segment Database

The York Urban Line Segment Database is a compilation of 102 images (45 indoor, 57 outdoor) of urban environments consisting mostly of scenes from the campus of York University and downtown Toronto, Canada. The images are 640 x 480 in size and have been taken with a calibrated Panasonic Lumix DMC-LC80 digital camera.

17 papers | 0 benchmarks | Images

Kleister NDA

Kleister NDA is a dataset for Key Information Extraction (KIE). The dataset contains a mix of scanned and born-digital long formal English-language documents. For this dataset, an NLP system is expected to find or infer various types of entities by employing both textual and structural layout features. The Kleister NDA dataset has 540 non-disclosure agreements, with 3,229 unique pages and 2,160 entities to extract.

17 papers | 1 benchmark | Texts

SECOND (SEmantic Change detectiON Dataset)

SECOND is a well-annotated semantic change detection dataset. To ensure data diversity, we first collect 4,662 pairs of aerial images from several platforms and sensors. These image pairs are distributed over cities such as Hangzhou, Chengdu, and Shanghai. Each image is 512 x 512 in size and is annotated at the pixel level. The annotation of SECOND is carried out by an expert group in earth vision applications, which guarantees high label accuracy. For the change category in the SECOND dataset, we focus on 6 main land-cover classes, i.e., non-vegetated ground surface, tree, low vegetation, water, buildings and playgrounds, that are frequently involved in natural and man-made geographical changes. It is worth noting that, in the new dataset, non-vegetated ground surface (n.v.g. surface for short) mainly corresponds to impervious surface and bare land. In summary, these 6 selected land-cover categories result in 30 common change categories (including non-change). Through the

17 papers | 3 benchmarks | Images

ConvQuestions

ConvQuestions is the first realistic benchmark for conversational question answering over knowledge graphs. It contains 11,200 conversations which can be evaluated over Wikidata. They are compiled from the inputs of 70 Master crowdworkers on Amazon Mechanical Turk, with conversations from five domains: Books, Movies, Soccer, Music, and TV Series. The questions feature a variety of complex question phenomena like comparisons, aggregations, compositionality, and temporal reasoning. Answers are grounded in Wikidata entities to enable fair comparison across diverse methods. The data gathering setup was kept as natural as possible, with the annotators selecting entities of their choice from each of the five domains, and formulating the entire conversation in one session. All questions in a conversation are from the same Turker, who also provided gold answers to the questions. For suitability to knowledge graphs, questions were constrained to be objective or factoid in nature, but no other r

17 papers | 0 benchmarks | Texts

HPO-B

HPO-B is a benchmark for assessing the performance of hyperparameter optimization (HPO) algorithms.

17 papers | 0 benchmarks
