19,997 machine learning datasets
19,997 dataset results
EDEN (Enclosed garDEN) is a multimodal synthetic dataset, a dataset for nature-oriented applications. The dataset features more than 300K images captured from more than 100 garden models. Each image is annotated with various low/high-level vision modalities, including semantic segmentation, depth, surface normals, intrinsic colors, and optical flow.
The VideoNavQA dataset contains pairs of questions and videos generated in the House3D environment. The goal of this dataset is to assess question-answering performance from nearly-ideal navigation paths, while considering a much more complete variety of questions than current instantiations of the Embodied Question Answering (EQA) task.
QUVA Repetition dataset consists of 100 videos displaying a wide variety of repetitive video dynamics, including swimming, stirring, cutting, combing and music-making. All videos have been annotated with individual cycle bounds and a total repetition count.
NeuralNews is a dataset for machine-generated news detection. It consists of human-generated and machine-generated articles. The human-generated articles are extracted from the GoodNews dataset, which is extracted from the New York Times. It contains 4 types of articles:
Provides a large-scale synthetic dataset which contains accurate ground truth depth of various photo-realistic scenes.
AeroRIT is a hyperspectral dataset to facilitate aerial hyperspectral scene understanding.
ArCOV19-Rumors is an Arabic COVID-19 Twitter dataset for misinformation detection composed of tweets containing claims from 27th January till the end of April 2020.
This self-driving dataset collected in Brno, Czech Republic contains data from four WUXGA cameras, two 3D LiDARs, inertial measurement unit, infrared camera and especially differential RTK GNSS receiver with centimetre accuracy.
Chinese AI and Law 2019 Similar Case Matching dataset. CAIL2019-SCM contains 8,964 triplets of cases published by the Supreme People's Court of China. CAIL2019-SCM focuses on detecting similar cases, and the participants are required to check which two cases are more similar in the triplets.
COMP6 is a benchmark for evaluating the extensibility of machine-learning based molecular potentials. It contains a diverse set of organic molecules.
CompGuessWhat?! extends the original GuessWhat?! datasets with a rich semantic representations in the form of scene graphs associated with every image used as reference scene for the guessing games.
The ContentWise Impressions dataset is a collection of implicit interactions and impressions of movies and TV series from an Over-The-Top media service, which delivers its media contents over the Internet. The dataset is distinguished from other already available multimedia recommendation datasets by the availability of impressions, i.e., the recommendations shown to the user, its size, and by being open-source. The items in the dataset represent the multimedia content that the service provided to the users and are represented by an anonymized numerical identifier. The items refer to television and cinema products belonging to four mutually exclusive categories: movies, movies and clips in series, TV movies or shows, and episodes of TV series. The interactions represent the actions performed by users on items in the service and are associated with the timestamp when it occurred. Interactions contain the identifier of the impressions, except in those cases where the recommendations came
The TUB CrowdFlow is a synthetic dataset that contains 10 sequences showing 5 scenes. Each scene is rendered twice: with a static point of view and a dynamic camera to simulate drone/UAV based surveillance. The scenes are render using Unreal Engine at HD resolution (1280x720) at 25 fps, which is typical for current commercial CCTV surveillance systems. The total number of frames is 3200.
CSPubSum is a dataset for summarisation of computer science publications, created by exploiting a large resource of author provided summaries and show straightforward ways of extending it further.
A large-scale anime image database with 4.2m+ images annotated with 130m+ text tags describing image contents in detail; it can be useful for machine learning purposes such as image recognition and generation. It has been applied to a wide variety of applications, particularly generative modeling.
The DDI-100 dataset is a synthetic dataset for text detection and recognition based on 7000 real unique document pages and consists of more than 100000 augmented images. The ground truth comprises text and stamp masks, text and characters bounding boxes with relevant annotations.
A new English-French test set for the evaluation of Machine Translation (MT) for informal, written bilingual dialogue. The test set contains 144 spontaneous dialogues (5,700+ sentences) between native English and French speakers, mediated by one of two neural MT systems in a range of role-play settings. The dialogues are accompanied by fine-grained sentence-level judgments of MT quality, produced by the dialogue participants themselves, as well as by manually normalised versions and reference translations produced a posteriori.
dMelodies is dataset of simple 2-bar melodies generated using 9 independent latent factors of variation where each data point represents a unique melody based on the following constraints: - Each melody will correspond to a unique scale (major, minor, blues, etc.). - Each melody plays the arpeggios using the standard I-IV-V-I cadence chord pattern. - Bar 1 plays the first 2 chords (6 notes), Bar 2 plays the second 2 chords (6 notes). - Each played note is an 8th note.
The Dataset of Multimodal Semantic Egocentric Video (DoMSEV) contains 80-hours of multimodal (RGB-D, IMU, and GPS) data related to First-Person Videos with annotations for recorder profile, frame scene, activities, interaction, and attention.
This dataset provides a large number of training and testing example which is sufficient for a deep learning approach to address Dunhuang Grotto Painting restoration.