Datasets

3,148 machine learning datasets

Crypto related tweets from 10.10.2020 to 3.3.2021

The dataset contains 30 million cryptocurrency-related tweets from 10.10.2020 to 3.3.2021. See https://github.com/meakbiyik/ask-who-not-what for more details.

1 paper | 0 benchmarks | Tabular, Texts

RoomEnv-v1 (The Room environment - v1)

The Room environment - v1

1 paper | 1 benchmark | Graphs, Texts

Fallout New Vegas Dialog

Fallout New Vegas Dialog is a multilingual, sentiment-annotated dialog dataset from the game Fallout New Vegas. The game developers pre-annotated every line of dialog in the game with one of 8 sentiments: anger, disgust, fear, happy, neutral, pained, sad and surprised. The lines have been translated into 5 languages: English, Spanish, German, French and Italian.

1 paper | 0 benchmarks | Dialog, Texts

ScanEnts3D

Scan Entities in 3D (ScanEnts3D) is a large-scale dataset that provides explicit correspondences between 369k objects across 84k natural referential sentences, covering 705 real-world scenes.

1 paper | 0 benchmarks | 3D, Texts

MiST

MiST (Modals In Scientific Text) is a dataset containing 3737 modal instances in five scientific domains annotated for their semantic, pragmatic, or rhetorical function.

1 paper | 0 benchmarks | Texts

FreCDo (French cross-domain)

FreCDo is a corpus for French dialect identification comprising 413,522 French text samples collected from public news websites in Belgium, Canada, France and Switzerland.

1 paper | 0 benchmarks | Texts

Robust Summarization Evaluation Benchmark

Robust Summarization Evaluation Benchmark is a large human evaluation dataset consisting of over 22k summary-level annotations over state-of-the-art systems on three datasets.

1 paper | 0 benchmarks | Texts

FETA Car-Manuals (FETA Car-Manuals dataset, image-text retrieval for foundation models' expert data performance.)

FETA benchmark focuses on text-to-image and image-to-text retrieval in public car manuals and sales catalogue brochures. The FETA Car-Manuals dataset consists of a total of 349 PDF documents from 5 car manufacturers, namely Nissan, Toyota, Mazda, Renault, Chevrolet.

1 paper | 6 benchmarks | Images, Texts

FETA IKEA

FETA benchmark focuses on text-to-image and image-to-text retrieval in public car manuals and sales catalogue brochures. The FETA IKEA dataset contains 26 documents with 7366 pages total, approximately 9574 images and 23927 texts automatically extracted from those pages.

1 paper | 0 benchmarks | Images, Texts

Verifee

Verifee is a dataset of news articles with fine-grained trustworthiness annotations. It contains over 10,000 unique articles from almost 60 Czech online news sources. These are categorized into one of the 4 classes across the credibility spectrum we propose, ranging from entirely trustworthy articles all the way to the manipulative ones.

1 paper | 0 benchmarks | Texts

SimpEvalASSET

SimpEvalASSET is a dataset for training learnable metrics using modern language models. It comprises 12K human ratings on 2.4K simplifications of 24 systems, and SIMPEVAL_2022, a challenging simplification benchmark consisting of over 1K human ratings of 360 simplifications, including generations from GPT-3.5.

1 paper | 0 benchmarks | Texts

Cards Against Humanity

A dataset of games played in the card game "Cards Against Humanity" (CAH), by human players, derived from the online CAH labs. Each round includes the cards presented to users - a "black" prompt with a blank or question and 10 "white" punchlines as possible responses, and which punchline was picked by a player each round, along with text and metadata.

1 paper | 0 benchmarks | Texts

Skit-S2I (Skit-S2I: An Indian Accented Speech to Intent dataset)

This dataset for Intent classification from human speech covers 14 coarse-grained intents from the Banking domain. This work is inspired by a similar release in the Minds-14 dataset - here, we restrict ourselves to Indian English but with a much larger training set. The data was generated by 11 (Indian English) speakers and recorded over a telephony line. We also provide access to anonymized speaker information - like gender, languages spoken, and native language - to allow more structured discussions around robustness and bias in the models you train.

1 paper | 0 benchmarks | Audio, Texts

MENYO-20k

MENYO-20k is the first multi-domain parallel corpus with a special focus on clean orthography for Yorùbá-English, with standardized train-test splits for benchmarking.

1 paper | 0 benchmarks | Parallel, Texts

Probability words NLI (Natural language inference with words estimative of probability (WEP))

This dataset tests the capabilities of language models to correctly capture the meaning of words denoting probabilities (WEP), e.g. words like "probably", "maybe", "surely", "impossible".

1 paper | 1 benchmark | Texts

abc_cc (ABC CC)

The dataset used to train and evaluate TunesFormer is collected from two sources: The Session and ABCnotation.com. The Session is a community website focused on Irish traditional music, while ABCnotation.com is a website that provides a standard for folk and traditional music notation in the form of ASCII text files. The combined dataset consists of 285,449 ABC tunes, with 99% (282,595) of the tunes used as the training set and the remaining 1% (2,854) used as the evaluation set.
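The split sizes quoted above can be checked with a few lines of arithmetic. This is only a sketch: the description gives the counts but not how the split was computed, so rounding the 1% evaluation share down is an assumption, and the variable names are illustrative.

```python
# Counts taken from the dataset description.
total_tunes = 285_449

# Assumption: the 1% evaluation share is rounded down to a whole tune.
eval_size = total_tunes // 100
train_size = total_tunes - eval_size

print(train_size, eval_size)  # prints: 282595 2854
```

Under that assumption the two parts match the stated 282,595 and 2,854 tunes and sum back to 285,449.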

1 paper | 0 benchmarks | Texts

AviationQA

AviationQA is introduced in the paper "There is No Big Brother or Small Brother: Knowledge Infusion in Language Models for Link Prediction and Question Answering".

1 paper | 1 benchmark | Texts

ParagraphOrdreing

We have prepared a dataset, ParagraphOrdreing, which consists of around 300,000 paragraph pairs. We collected our data from Project Gutenberg and wrote an API for gathering and pre-processing it into the format required by the task. Each example contains two paragraphs and a label indicating whether the second paragraph actually follows the first (true order, label 1) or the order has been reversed.
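The pair format described above could be produced along the following lines. This is a minimal sketch, not the authors' API: the function name `make_pairs`, the use of consecutive paragraphs, the 50/50 chance of reversing a pair, and the label 0 for reversed pairs are all assumptions made for illustration (the description only states that true order is labeled 1).

```python
import random


def make_pairs(paragraphs, seed=0):
    """Build (first, second, label) examples from consecutive paragraphs.

    Label 1 marks the true order. Label 0 for a reversed pair is an
    assumption; the dataset description does not name that label.
    """
    rng = random.Random(seed)  # seeded so the output is reproducible
    examples = []
    for first, second in zip(paragraphs, paragraphs[1:]):
        if rng.random() < 0.5:
            examples.append((first, second, 1))  # keep the true order
        else:
            examples.append((second, first, 0))  # swap the two paragraphs
    return examples


book = ["Paragraph one.", "Paragraph two.", "Paragraph three."]
for first, second, label in make_pairs(book):
    print(label, "|", first, "->", second)
```

A classifier trained on such examples only has to decide, for each pair, whether the second paragraph follows the first.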

1 paper | 0 benchmarks | Texts

ValNov Subtask B

Validity and Novelty are determined in a comparative setting between two conclusions at a time. For both Validity and Novelty, the possible labels are "Conclusion 1 is better", "tie" and "Conclusion 2 is better".

1 paper | 12 benchmarks | Texts

InstructPix2Pix Image Editing Dataset

A dataset for image editing containing >450k samples of:

1 paper | 0 benchmarks | Images, Texts
