19,997 machine learning datasets
19,997 dataset results
OPOSUM is a dataset for the training and evaluation of Opinion Summarization models which contains Amazon reviews from six product domains: Laptop Bags, Bluetooth Headsets, Boots, Keyboards, Televisions, and Vacuums. The six training collections were created by downsampling from the Amazon Product Dataset introduced in McAuley et al. (2015) and contain reviews and their respective ratings.
ForecastQA is a question-answering dataset consisting of 10,392 event forecasting questions, which have been collected and verified via crowdsourcing efforts. The forecasting problem for this dataset is formulated as a restricted-domain, multiple-choice, question-answering (QA) task that simulates the forecasting scenario.
The Spotify Music Streaming Sessions Dataset (MSSD) consists of 160 million streaming sessions with associated user interactions, audio features and metadata describing the tracks streamed during the sessions, and snapshots of the playlists listened to during the sessions.
The Chinese Academy of Sciences Micro-Expression dataset (CASME II) consists of 255 videos, elicited from 26 participants. The videos are recorded using Point Gray GRAS-03K2C camera which has a frame rate of 200fps. The average video length is 0.34s, equivalent to 68 frames. Each video’s emotion label is annotated by two coders, where the reliability is 0.846.
The Atari Grand Challenge dataset is a large dataset of human Atari 2600 replays. It consists of replays for 5 different games: * Space Invaders (445 episodes, 2M frames) * Q*bert (659 episodes, 1.6M frames) * Ms.Pacman (384 episodes, 1.7M frames) * Video Pinball (211 episodes, 1.5M frames) * Montezuma’s revenge (668 episodes, 2.7M frames)
An annotated dataset of ~50K news that can be used for building automated fake news detection systems for a low resource language like Bangla.
A new challenging dataset that can be used for many pattern recognition tasks.
A configurable visual question and answer dataset (COG) to parallel experiments in humans and animals. COG is much simpler than the general problem of video analysis, yet it addresses many of the problems relating to visual and logical reasoning and memory -- problems that remain challenging for modern deep learning architectures.
The dataset was captured with a DAVIS camera that concurrently streams both dynamic vision sensor (DVS) brightness change events and active pixel sensor (APS) intensity frames. DDD20 is the longest event camera end-to-end driving dataset to date with 51h of DAVIS event+frame camera and vehicle human control data collected from 4000km of highway and urban driving under a variety of lighting conditions.
Composes sentence pairs (i.e., twin sentences).
Contains over 11000 images of 1000 identities with different types of disguise accessories. The dataset is collected from the Internet, resulting in unconstrained face images similar to real world settings.
Includes egocentric videos containing hands in the wild.
FocusPath is a dataset compiled from diverse Whole Slide Image (WSI) scans in different focus (z-) levels. Images are naturally blurred by out-of-focus lens provided with GT scores of focus levels. The dataset can be used for No-Reference Focus Quality assessment of Digital Pathology/Microscopy images.
A challenging multi-agent seasonal dataset collected by a fleet of Ford autonomous vehicles at different days and times during 2017-18.
Supports new task that predicts future locations of people observed in first-person videos.
Grocery Store is a dataset of natural images of grocery items. All natural images were taken with a smartphone camera in different grocery stores. It contains 5,125 natural images from 81 different classes of fruits, vegetables, and carton items (e.g. juice, milk, yoghurt). The 81 classes are divided into 42 coarse-grained classes, where e.g. the fine-grained classes 'Royal Gala' and 'Granny Smith' belong to the same coarse-grained class 'Apple'. Additionally, each fine-grained class has an associated iconic image and a product description of the item.
A large-scale corpus of Gulf Arabic consisting of 110 million words from 1,200 forum novels.
The HandNet dataset contains depth images of 10 participants' hands non-rigidly deforming in front of a RealSense RGB-D camera. The annotations are generated by a magnetic annotation technique. 6D pose is available for the center of the hand as well as the five fingertips (i.e. position and orientation of each).
Hindi Visual Genome is a multimodal dataset consisting of text and images suitable for English-Hindi multimodal machine translation task and multimodal research.
The Hotels-50K dataset consists of over 1 million images from 50,000 different hotels around the world. These images come from both travel websites, as well as the TraffickCam mobile application, which allows every day travelers to submit images of their hotel room in order to help combat trafficking. The TraffickCam images are more visually similar to images from trafficking investigations than the images from travel websites.