Datasets

3,148 machine learning datasets

KETOD (Knowledge-Enriched Task-Oriented Dialogue)

KETOD (Knowledge-Enriched Task-Oriented Dialogue) is a dataset containing system responses designed for enriching task-oriented dialogues with chit-chat based on relevant entity knowledge. There are a total of 5,324 dialogues with enriched system responses.

5 papers | 0 benchmarks | Texts

Instructional-DT (Instr-DT) (Instructional Discourse Treebank)

This discourse treebank includes annotated instructional texts originally assembled at the Information Technology Research Institute, University of Brighton. The dataset contains 176 documents with an average of 32.6 elementary discourse units (EDUs) each, for a total of 5,744 EDUs and 53,250 words.

5 papers | 4 benchmarks | Texts

CrossRE

CrossRE is a cross-domain benchmark for Relation Extraction (RE) that comprises six distinct text domains and includes multi-label annotations. The dataset also includes metadata collected during annotation, such as explanations and flags for difficult instances.

5 papers | 0 benchmarks | Texts

BioNLI (Biomedical Natural Language Inference)

BioNLI is a dataset for biomedical natural language inference. It contains abstracts from biomedical literature and mechanistic premises generated with nine different strategies.

5 papers | 1 benchmark | Texts

Open Relation Modeling

Open Relation Modeling is the task of generating, given two entities, a coherent sentence describing the relation between them.

5 papers | 0 benchmarks | Texts

SpaRTUN

SpaRTUN is a dataset synthesized for transfer learning on spatial question answering (SQA) and spatial role labeling (SpRL).

5 papers | 0 benchmarks | Texts

IGLU

IGLU is a dataset designed for interactive grounded language understanding. It has a total of 8,136 single-turn data pairs of instructions and actions. Each sample is randomly initialized with a pre-built structure from previously collected multi-turn interaction data.

5 papers | 0 benchmarks | Texts

NLPeer

NLPeer is a multidomain corpus of more than 5k papers and 11k review reports from five different venues. In addition to new datasets of paper drafts, camera-ready versions, and peer reviews from the NLP community, it provides a unified data representation and augments previous peer review datasets with parsed, structured paper representations, rich metadata, and versioning information.

5 papers | 0 benchmarks | Texts

RuCoLA

The Russian Corpus of Linguistic Acceptability (RuCoLA) is built from the ground up under the well-established binary linguistic acceptability (LA) approach. RuCoLA consists of 9.8k in-domain sentences from linguistic publications and 3.6k out-of-domain sentences produced by generative models.

5 papers | 2 benchmarks | Texts

ComFact

ComFact is a benchmark for commonsense fact linking, in which models are given contexts and trained to identify situationally relevant commonsense knowledge from knowledge graphs (KGs). ComFact contains ∼293k in-context relevance annotations for commonsense triplets across four stylistically diverse dialogue and storytelling datasets.

5 papers | 0 benchmarks | Texts

ADVETA

ADVErsarial Table perturbAtion (ADVETA) is a robustness evaluation benchmark featuring natural and realistic adversarial table perturbations (ATPs). It is based on three mainstream Text-to-SQL datasets: Spider, WikiSQL, and WTQ.

5 papers | 0 benchmarks | Texts

Distress Analysis Interview Corpus/Wizard-of-Oz set (DAIC-WOZ)

The Distress Analysis Interview Corpus/Wizard-of-Oz set (DAIC-WOZ) dataset [24, 25] comprises voice and text samples from 189 interviewed healthy and control persons, together with their PHQ-8 depression detection questionnaires. The dataset is commonly used in research on text-based detection, voice-based detection, and multi-modal architectures.

5 papers | 0 benchmarks | Audio, Texts, Videos

HaDes

HaDes (HAllucination DEtection dataSet) is a token-level, reference-free hallucination detection dataset. To create it, a large number of text segments extracted from English-language Wikipedia were perturbed and then verified with crowd-sourced annotations.

5 papers | 0 benchmarks | Texts

BB-norm-habitat (Bacteria Biotope - entity normalization - bacterial habitat)

In the BB-norm modality of this task, participant systems had to normalize textual entity mentions according to the OntoBiotope ontology for habitats. See BB-dataset for more information.

5 papers | 0 benchmarks | Biology, Texts

BB-norm-phenotype (Bacteria Biotope - entity normalization - phenotype)

In the BB-norm modality of this task, participant systems had to normalize textual entity mentions according to the OntoBiotope ontology for phenotypes. See BB-dataset for more information.

5 papers | 0 benchmarks | Biology, Texts

Biwi 3D Audiovisual Corpus of Affective Communication - B3D(AC)^2 (BIWI 3D)

The BIWI 3D corpus comprises a total of 1109 sentences uttered by 14 native English speakers (6 males and 8 females). A real-time 3D scanner and a professional microphone were used to capture the facial movements and the speech of the speakers. The dense dynamic face scans were acquired at 25 frames per second, and the RMS error in the 3D reconstruction is about 0.5 mm. In order to ease automatic speech segmentation, we carried out the recordings in an anechoic room, with walls covered by sound wave-absorbing materials.

5 papers | 14 benchmarks | 3d meshes, Audio, Texts

BioCoder

BioCoder is a benchmark developed to evaluate existing pre-trained models in generating bioinformatics code. In relation to function-code generation, BioCoder covers potential package dependencies, class declarations, and global variables. It incorporates 1026 functions and 1243 methods in Python and Java from GitHub and 253 examples from the Rosalind Project.

5 papers | 0 benchmarks | Texts

VidChapters-7M

VidChapters-7M is a dataset of 817K user-chaptered videos including 7M chapters in total. It is created automatically and scalably from online videos by scraping user-annotated chapters, and hence requires no additional manual annotation. It is designed for training and evaluating models for video chapter generation with or without ground-truth boundaries and for video chapter grounding, as well as for video-language pretraining.

5 papers | 16 benchmarks | Texts, Videos

SOFC-Exp

The SOFC-Exp corpus contains 45 scientific publications about solid oxide fuel cells (SOFCs), published between 2013 and 2019 as open-access articles, all under a CC-BY license. The dataset was manually annotated by domain experts.

5 papers | 0 benchmarks | Texts

FLD (Formal Logic Deduction)

FLD is a deductive reasoning benchmark based on formal logic theory. A model is required to generate a proof that proves or disproves a given hypothesis from a given set of facts.

5 papers | 0 benchmarks | Texts
