Datasets

19,997 machine learning datasets


EMDB

EMDB contains in-the-wild videos of human activity recorded with a hand-held iPhone. It features reference SMPL body pose and shape parameters, as well as global body root and camera trajectories. The reference 3D poses were obtained by jointly fitting SMPL to 12 body-worn electromagnetic sensors and image data. For the latter we fit a neural implicit avatar model to allow for a dense pixel-wise fitting objective.

32 papers | 36 benchmarks | 3D, Images, RGB Video, Videos

M3Exam

M3Exam is a multilingual, multimodal, and multilevel benchmark designed for evaluating Large Language Models (LLMs). Unlike traditional benchmarks, which often focus on specific tasks or datasets, M3Exam takes a more comprehensive approach by sourcing real and official human exam questions.

32 papers | 0 benchmarks

P-Stance

P-Stance is a large dataset for stance detection in the political domain, introduced in 2021.

32 papers | 1 benchmark | Texts

SUN (SUN Database)

When glancing at a magazine, or browsing the Internet, we are continuously being exposed to photographs. Despite this overflow of visual information, humans are extremely good at remembering thousands of pictures along with some of their visual details. But not all images are equal in memory. Some stick in our minds, and others are forgotten. In this paper we focus on the problem of predicting how memorable an image will be. We show that memorability is a stable property of an image that is shared across different viewers. We introduce a database for which we have measured the probability that each picture will be remembered after a single view. We analyze image features and labels that contribute to making an image memorable, and we train a predictor based on global image descriptors. We find that predicting image memorability is a task that can be addressed with current computer vision techniques. Whereas making memorable images is a challenging task in visualization and photograp

31 papers | 5 benchmarks | Images

MSRC-12 (MSRC-12 Kinect Gesture Dataset)

The Microsoft Research Cambridge-12 Kinect gesture data set consists of sequences of human movements, represented as body-part locations, and the associated gesture to be recognized by the system. The data set includes 594 sequences and 719,359 frames (approximately six hours and 40 minutes) collected from 30 people performing 12 gestures. In total, there are 6,244 gesture instances. The motion files contain tracks of 20 joints estimated using the Kinect Pose Estimation pipeline. The body poses are captured at a sample rate of 30 Hz with an accuracy of about two centimeters in joint positions.

31 papers | 9 benchmarks | 3D, Images
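
The duration quoted for MSRC-12 can be cross-checked from the frame count and sample rate given in its description. The sketch below is only a sanity check on those published figures; the helper name `duration_hms` is made up for illustration and is not part of any dataset tooling.

```python
# Figures quoted in the MSRC-12 description.
TOTAL_FRAMES = 719_359
SAMPLE_RATE_HZ = 30
NUM_SEQUENCES = 594
NUM_GESTURE_INSTANCES = 6_244


def duration_hms(frames, rate_hz):
    """Convert a frame count at a fixed sample rate to (hours, minutes, seconds)."""
    total_seconds = round(frames / rate_hz)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


hours, minutes, seconds = duration_hms(TOTAL_FRAMES, SAMPLE_RATE_HZ)
print(f"Total duration: {hours} h {minutes} min {seconds} s")

# Rough per-sequence averages derived from the same figures.
print(f"Frames per sequence: {TOTAL_FRAMES / NUM_SEQUENCES:.1f}")
print(f"Gesture instances per sequence: {NUM_GESTURE_INSTANCES / NUM_SEQUENCES:.1f}")
```

The result is just under six hours and 40 minutes, which agrees with the rounded figure in the description.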

TIMIT (TIMIT Acoustic-Phonetic Continuous Speech Corpus)

The TIMIT Acoustic-Phonetic Continuous Speech Corpus is a standard dataset used for the evaluation of automatic speech recognition systems. It consists of recordings of 630 speakers of 8 dialects of American English, each reading 10 phonetically rich sentences. It also comes with word- and phone-level transcriptions of the speech.

31 papers | 1 benchmark | Speech, Texts

ASTD (Arabic Sentiment Tweets Dataset)

Arabic Sentiment Tweets Dataset (ASTD) is an Arabic social sentiment analysis dataset gathered from Twitter. It consists of about 10,000 tweets which are classified as objective, subjective positive, subjective negative, and subjective mixed.

31 papers | 1 benchmark

OIE2016

OIE2016 is the first large-scale OpenIE benchmark. It is created by automatic conversion from QA-SRL [He et al., 2015], a semantic role labeling dataset. The sentences are from news (e.g., WSJ) and encyclopedia (e.g., WIKI) domains. Since there are no restrictions on the elements of OpenIE extractions, a partial-matching criterion is typically used instead of exact matching. Hence, the evaluation script tolerates extractions that differ slightly from the gold annotation.

31 papers | 2 benchmarks
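
The difference between exact and partial matching of OpenIE extractions can be illustrated with a small sketch. This is not the OIE2016 evaluation script: the token-overlap rule, the `threshold` value, and the function names below are all assumptions chosen only to show why a lenient criterion accepts extractions that split their elements differently from the gold annotation.

```python
def tokens(extraction):
    """Lower-case and whitespace-tokenize every element of an extraction tuple."""
    return [tok for part in extraction for tok in part.lower().split()]


def exact_match(predicted, gold):
    """Strict criterion: every element must be identical to the gold element."""
    return tuple(predicted) == tuple(gold)


def partial_match(predicted, gold, threshold=0.5):
    """Lenient criterion (hypothetical): accept the prediction when at least
    `threshold` of the gold tokens also appear somewhere in the prediction."""
    gold_tokens = tokens(gold)
    if not gold_tokens:
        return False
    predicted_tokens = set(tokens(predicted))
    covered = sum(1 for tok in gold_tokens if tok in predicted_tokens)
    return covered / len(gold_tokens) >= threshold


gold = ("Obama", "was born in", "Hawaii")
predicted = ("Barack Obama", "was born", "in Hawaii")

print(exact_match(predicted, gold))    # element boundaries differ
print(partial_match(predicted, gold))  # all gold tokens are covered
```

Here the prediction places "in" in the argument rather than the relation, so it fails the strict comparison but passes the overlap-based one.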

SOC (Salient Objects in Clutter)

SOC (Salient Objects in Clutter) is a dataset for Salient Object Detection (SOD). It includes images with salient and non-salient objects from daily object categories. Beyond object category annotations, each salient image is accompanied by attributes that reflect common challenges in real-world scenes.

31 papers | 18 benchmarks | Images

Multi-dSprites

31 papers | 1 benchmark

Image-Chat

The IMAGE-CHAT dataset is a large collection of (image, style trait for speaker A, style trait for speaker B, dialogue between A & B) tuples that we collected using crowd-workers. Each dialogue consists of consecutive turns by speakers A and B. No particular constraints are placed on the kinds of utterance; we only ask both speakers to use the provided style trait and to respond to the given image and dialogue history in an engaging way. The goal is not just to build a diagnostic dataset but a basis for training models that humans actually want to engage with.

31 papers | 9 benchmarks | Images, Texts

CDD Dataset (season-varying)

Source: Change Detection in Remote Sensing Images Using Conditional Adversarial Networks

31 papers | 10 benchmarks

MQ2008

MQ2008 is a dataset for learning to rank. It contains 800 queries with labelled documents.

31 papers | 0 benchmarks | Ranking, Texts

GuitarSet

GuitarSet is a dataset of high-quality guitar recordings and rich annotations. It contains 360 excerpts, each 30 seconds in length.

31 papers | 2 benchmarks | Audio

Friedman1

The friedman1 data set is commonly used to test semi-supervised regression methods.

31 papers | 0 benchmarks | Tabular
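
For readers unfamiliar with Friedman1, the sketch below generates data of that kind and hides most of the targets, as a semi-supervised regression test would. The entry itself does not give the generating function; the formula used here is the commonly cited Friedman #1 regression function and should be read as background knowledge, not as something stated on this page. The function names, the default of 10 features, and the labelled fraction are illustrative assumptions.

```python
import math
import random


def sample_friedman1(n_samples, n_features=10, noise=0.0, seed=0):
    """Draw samples from the commonly cited Friedman #1 function (assumed here):
    y = 10 sin(pi x1 x2) + 20 (x3 - 0.5)^2 + 10 x4 + 5 x5 + noise * N(0, 1).
    Features are uniform on [0, 1]; any features beyond the fifth are irrelevant."""
    if n_features < 5:
        raise ValueError("Friedman #1 needs at least 5 features")
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        x = [rng.random() for _ in range(n_features)]
        target = (
            10 * math.sin(math.pi * x[0] * x[1])
            + 20 * (x[2] - 0.5) ** 2
            + 10 * x[3]
            + 5 * x[4]
            + noise * rng.gauss(0, 1)
        )
        X.append(x)
        y.append(target)
    return X, y


def mask_labels(y, labelled_fraction, seed=0):
    """Keep a random subset of targets and replace the rest with None,
    mimicking the labelled/unlabelled split of a semi-supervised setup."""
    rng = random.Random(seed)
    n_labelled = round(len(y) * labelled_fraction)
    keep = set(rng.sample(range(len(y)), n_labelled))
    return [value if i in keep else None for i, value in enumerate(y)]


X, y = sample_friedman1(n_samples=100)
y_partial = mask_labels(y, labelled_fraction=0.1)
print(sum(v is not None for v in y_partial), "labelled of", len(y_partial))
```

A semi-supervised regressor would then train on all rows of `X` but only on the targets that are not `None`.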

DCASE 2016

DCASE 2016 is a dataset for sound event detection. It consists of 20 short mono sound files for each of 11 sound classes (from office environments, like clearthroat, drawer, or keyboard), each file containing one sound event instance. Sound files are annotated with event onset and offset times; however, silences between actual physical sounds (as with a phone ringing) are not marked and are hence “included” in the event.

31 papers | 0 benchmarks | Audio

WCEP (Wikipedia Current Events Portal)

The WCEP dataset for multi-document summarization (MDS) consists of short, human-written summaries about news events, obtained from the Wikipedia Current Events Portal (WCEP), each paired with a cluster of news articles associated with an event. These articles consist of sources cited by editors on WCEP, and are extended with articles automatically obtained from the Common Crawl News dataset.

31 papers | 6 benchmarks | Texts

Deep Fashion3D

Deep Fashion3D is a benchmark and dataset for the evaluation of image-based garment reconstruction systems. It contains 2078 models reconstructed from real garments, covering 10 different categories and 563 garment instances. It provides rich annotations, including 3D feature lines, 3D body pose, and the corresponding multi-view real images. In addition, each garment is randomly posed to enhance the variety of real clothing deformations.

31 papers | 0 benchmarks

Dreaddit

Dreaddit consists of 190K posts from five different categories of Reddit communities.

31 papers | 0 benchmarks

ExPose (EXpressive POse and Shape rEgression)

ExPose curates a dataset of SMPL-X fits on in-the-wild images.

31 papers | 0 benchmarks
