19,997 machine learning datasets
The TVSum dataset was introduced by Song et al. in "TVSum: Summarizing Web Videos Using Titles".
MR Movie Reviews is a dataset for use in sentiment-analysis experiments. Available are collections of movie-review documents labeled with respect to their overall sentiment polarity (positive or negative) or subjective rating (e.g., "two and a half stars") and sentences labeled with respect to their subjectivity status (subjective or objective) or polarity.
The Resume dataset contains eight fine-grained entity categories; reported F-scores on the dataset range from 74.5% to 86.88%.
The SBU-Kinect-Interaction dataset (version 2.0) comprises RGB-D video sequences of humans performing interaction activities, recorded with the Microsoft Kinect sensor. The dataset was originally recorded for a class project and must be used only for research purposes. If you use this dataset in your work, please cite: Kiwon Yun, Jean Honorio, Debaleena Chattopadhyay, Tamara L. Berg, and Dimitris Samaras, The 2nd International Workshop on Human Activity Understanding from 3D Data at Conference on Computer Vision and Pattern Recognition (HAU3D-CVPRW), CVPR 2012.

SBU-Refine relabels the test set manually and refines the noisy labels in the training set algorithmically. H. Yang, T. Wang, X. Hu, and C.-W. Fu, "SILT: Shadow-aware iterative label tuning for learning to detect shadows from noisy labels," in ICCV, 2023, pp. 12687–12698.
The Gaming 3D Dataset (G3D) focuses on real-time action recognition in a gaming scenario. It contains 10 subjects performing 20 gaming actions: “punch right”, “punch left”, “kick right”, “kick left”, “defend”, “golf swing”, “tennis swing forehand”, “tennis swing backhand”, “tennis serve”, “throw bowling ball”, “aim and fire gun”, “walk”, “run”, “jump”, “climb”, “crouch”, “steer a car”, “wave”, “flap” and “clap”.
The MannequinChallenge Dataset (MQC) provides in-the-wild videos of people in static poses while a hand-held camera pans around the scene. The dataset consists of three splits for training, validation and testing.
The EVALution dataset is evenly distributed among the three classes (hypernyms, co-hyponyms, and random) and involves three parts of speech (noun, verb, adjective). The full dataset contains a total of 4,263 distinct terms: 2,380 nouns, 958 verbs, and 972 adjectives.
MuseData is an electronic library of orchestral and piano classical music from CCARH. It consists of 783 files totaling around 3 MB.
CREMA-D is an emotional multimodal actor dataset of 7,442 original clips from 91 actors. These clips were from 48 male and 43 female actors between the ages of 20 and 74, coming from a variety of races and ethnicities (African American, Asian, Caucasian, Hispanic, and Unspecified).
To collect How2QA for the video QA task, the same set of selected video clips is presented to another group of AMT workers for multiple-choice QA annotation. Each worker is assigned one video segment and asked to write one question with four answer candidates (one correct and three distractors). Similarly, narrations are hidden from the workers to ensure the collected QA pairs are not biased by subtitles. Similar to TVQA, start and end points are provided for the relevant moment of each question. After filtering low-quality annotations, the final dataset contains 44,007 QA pairs for 22k 60-second clips selected from 9,035 videos.
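The annotation format described above (one question, four candidates with one correct answer, and a timestamped relevant moment inside a 60-second clip) can be sketched as a small record type. The field names and validation rules below are illustrative assumptions, not How2QA's actual schema:

```python
from dataclasses import dataclass

@dataclass
class How2QAExample:
    """Hypothetical layout of one How2QA-style annotation."""
    video_id: str
    question: str
    candidates: list   # four answer options: one correct, three distractors
    correct_idx: int   # index of the correct answer in `candidates`
    start_sec: float   # start of the relevant moment within the clip
    end_sec: float     # end of the relevant moment within the clip

    def __post_init__(self):
        # One correct answer plus three distractors.
        assert len(self.candidates) == 4
        assert 0 <= self.correct_idx < 4
        # Moments must fall inside the 60-second clip.
        assert 0.0 <= self.start_sec < self.end_sec <= 60.0

ex = How2QAExample(
    video_id="vid_0001",
    question="What tool does the presenter pick up first?",
    candidates=["a whisk", "a spatula", "a knife", "a ladle"],
    correct_idx=1,
    start_sec=12.0,
    end_sec=20.5,
)
print(ex.candidates[ex.correct_idx])  # a spatula
```

The validation in `__post_init__` mirrors the filtering step mentioned above: malformed annotations (wrong number of candidates, out-of-range timestamps) are rejected at construction time.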
CelebA-Spoof is a large-scale face anti-spoofing dataset.
The dataset contains 349 COVID-19 CT images from 216 patients and 463 non-COVID-19 CT images. Its utility is confirmed by a senior radiologist who has been diagnosing and treating COVID-19 patients since the outbreak of the pandemic.
A human-to-human Chinese dialog dataset (about 10k dialogs, 156k utterances), which contains multiple sequential dialogs for every pair of a recommendation seeker (user) and a recommender (bot).
Dynamic FAUST extends the FAUST dataset to dynamic 4D data. It consists of high-resolution 4D scans of human subjects in motion, captured at 60 fps.
The dataset consists of millions of entries in which the MT element of each training triplet was obtained by translating the source side of publicly available parallel corpora and using the target side as an artificial human post-edit. Translations are obtained with both phrase-based and neural models.
A dataset of 28,792 retinal images from the EyePACS dataset, labeled with a three-level quality grading system (i.e., 'Good', 'Usable', and 'Reject') for evaluating retinal image quality assessment (RIQA) methods.
MedICaT is a dataset of medical images, captions, subfigure-subcaption annotations, and inline textual references. Figures and captions are extracted from open access articles in PubMed Central, and corresponding reference text is derived from S2ORC. The dataset consists of:
- 217,060 figures from 131,410 open access papers
- 7,507 subcaption and subfigure annotations for 2,069 compound figures
- inline references for ~25K figures in the ROCO dataset
The Multi Vehicle Stereo Event Camera (MVSEC) dataset is a collection of data designed for the development of novel 3D perception algorithms for event-based cameras. Stereo event data is collected from car, motorbike, hexacopter, and handheld platforms, and fused with lidar, IMU, motion capture, and GPS data to provide ground-truth pose and depth images.
Natural Language Decathlon Benchmark (decaNLP) is a challenge that spans ten tasks: question answering, machine translation, summarization, natural language inference, sentiment analysis, semantic role labeling, zero-shot relation extraction, goal-oriented dialogue, semantic parsing, and commonsense pronoun resolution. The tasks are cast as question answering over a context.
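The "cast as question answering over a context" idea can be illustrated with a minimal sketch: each task contributes a question template, and every instance becomes a (question, context, answer) triple. The function name, task keys, and question wordings below are illustrative assumptions, not decaNLP's actual templates:

```python
# Hypothetical sketch of decaNLP's unified format: every task instance
# becomes a (question, context, answer) triple.

QUESTION_TEMPLATES = {
    # Illustrative templates; decaNLP's real phrasings may differ.
    "sentiment": "Is this review negative or positive?",
    "summarization": "What is the summary?",
    "nli": "Hypothesis: {hypothesis} -- entailment, neutral, or contradiction?",
}

def as_qa_example(task: str, context: str, answer: str, **slots) -> dict:
    """Cast a task-specific instance into the shared QA-over-context format."""
    question = QUESTION_TEMPLATES[task].format(**slots)
    return {"question": question, "context": context, "answer": answer}

ex = as_qa_example(
    "sentiment",
    context="A gripping, beautifully shot film.",
    answer="positive",
)
print(ex["question"])  # Is this review negative or positive?
```

Because all ten tasks share this one interface, a single question-answering model can be trained and evaluated across the whole benchmark without task-specific output heads.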
A dataset for single-image 3D in the wild consisting of annotations of detailed 3D geometry for 140,000 images.