Datasets

19,997 machine learning datasets

MGif

MGif is a dataset of videos showing the movements of different cartoon animals. Each video is an animated GIF file. The dataset consists of 1,000 videos and is particularly challenging because of its high appearance variation and motion diversity.

14 papers | 2 benchmarks | Videos

SUMMIT

SUMMIT is a high-fidelity simulator that facilitates the development and testing of crowd-driving algorithms. By leveraging the open-source OpenStreetMap map database and a heterogeneous multi-agent motion prediction model developed in our earlier work, SUMMIT simulates dense, unregulated urban traffic for heterogeneous agents at any location worldwide that OpenStreetMap supports. SUMMIT is built as an extension of CARLA and inherits its physical and visual realism for autonomous driving simulation. SUMMIT supports a wide range of applications, including perception, vehicle control, planning, and end-to-end learning.

14 papers | 0 benchmarks | Environment

DeepStab

DeepStab is a dataset for online video stabilization consisting of synchronized steady/unsteady video pairs collected with well-designed hand-held hardware.

14 papers | 0 benchmarks | Videos

PPR10K (Portrait Photo Retouching dataset)

PPR10K is a large-scale and diverse dataset for portrait photo retouching (PPR), a task that aims to enhance the visual quality of a collection of flat-looking portrait photos.

14 papers | 0 benchmarks | Images

DIBCO and H_DIBCO ((Handwritten) Document Image Binarization Competition (DIBCO))

The Document Image Binarization Contest (DIBCO) is a binarization contest built on a popular document database. It was organized from 2009 to 2019, except for 2015.

14 papers | 0 benchmarks

TransNAS-Bench-101

TransNAS-Bench-101 is a Neural Architecture Search (NAS) benchmark dataset containing network performance across seven tasks, covering classification, regression, pixel-level prediction, and self-supervised tasks. This diversity provides opportunities to transfer NAS methods among tasks and allows for more complex transfer schemes to evolve. We explore two fundamentally different types of search space: cell-level search space and macro-level search space. With 7,352 backbones evaluated on seven tasks, 51,464 trained models with detailed training information are provided. With TransNAS-Bench-101, we hope to encourage the advent of exceptional NAS algorithms that raise cross-task search efficiency and generalizability to the next level.

14 papers | 0 benchmarks | Images
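
The model count quoted in the TransNAS-Bench-101 description follows from training every backbone once per task. The short Python check below is only an illustrative sketch of that arithmetic; the constant and function names are hypothetical and are not part of any TransNAS-Bench-101 API.

```python
# Figures quoted in the TransNAS-Bench-101 description.
NUM_BACKBONES = 7_352
NUM_TASKS = 7


def total_trained_models(backbones: int, tasks: int) -> int:
    """Assumes each backbone is trained exactly once on each task."""
    return backbones * tasks


total = total_trained_models(NUM_BACKBONES, NUM_TASKS)
print(total)  # 51464, matching the 51,464 trained models in the description
```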

2021 Hotel-ID

2021 Hotel-ID is a dataset for hotel recognition to help raise awareness of human trafficking and generate novel approaches. The dataset consists of hotel room images that have been crowd-sourced and uploaded through the TraffickCam mobile application.

14 papers | 0 benchmarks | Images

KUAKE-QIC (Query Intent Classification Dataset)

KUAKE Query Intent Classification is a dataset for intent classification, used for the KUAKE-QIC task. Given search engine queries, the task is to classify each query into one of 11 medical intent categories defined in KUAKE-QIC: diagnosis, etiology analysis, treatment plan, medical advice, test result analysis, disease description, consequence prediction, precautions, intended effects, treatment fees, and others.

14 papers | 1 benchmark | Texts
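
As a minimal sketch, the 11 KUAKE-QIC intent categories can be mapped to integer class ids for a classifier. The label strings below are the English category names from the description; the identifiers actually stored in the dataset files may differ, so the names and ordering here are assumptions for illustration only.

```python
# English category names as listed in the description (assumed, not the
# dataset's own label strings).
INTENT_LABELS = [
    "diagnosis",
    "etiology analysis",
    "treatment plan",
    "medical advice",
    "test result analysis",
    "disease description",
    "consequence prediction",
    "precautions",
    "intended effects",
    "treatment fees",
    "others",
]

# Bidirectional lookup tables between label names and class ids.
LABEL_TO_ID = {label: idx for idx, label in enumerate(INTENT_LABELS)}
ID_TO_LABEL = {idx: label for label, idx in LABEL_TO_ID.items()}


def encode(label: str) -> int:
    """Return the class id for a category name (KeyError if unknown)."""
    return LABEL_TO_ID[label]


def decode(class_id: int) -> str:
    """Return the category name for a class id (KeyError if unknown)."""
    return ID_TO_LABEL[class_id]
```

For example, `decode(encode("precautions"))` returns the label unchanged, which is a quick sanity check that the two tables agree.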

IMC PhotoTourism (Image Matching Challenge Phototourism)

A dataset provided by the Image Matching Workshop.

14 papers | 1 benchmark | Images

DadaGP

DadaGP is a new symbolic music dataset comprising 26,181 song scores in the GuitarPro format covering 739 musical genres, along with an accompanying tokenized format well-suited for generative sequence models such as the Transformer. The tokenized format is inspired by event-based MIDI encodings, often used in symbolic music generation models. The dataset is released with an encoder/decoder which converts GuitarPro files to tokens and back.

14 papers | 0 benchmarks | Midi

HiFiMask (CASIA-SURF HiFiMask)

CASIA-SURF HiFiMask (HiFiMask for short) is a large-scale High-Fidelity Mask dataset. It contains a total of 54,600 videos recorded from 75 subjects with 225 realistic masks by 7 new kinds of sensors.

14 papers | 0 benchmarks | Images, Videos

MVOR (Multi-View Operating Room)

The Multi-View Operating Room (MVOR) dataset consists of 732 synchronized multi-view frames recorded by three RGB-D cameras in a hybrid OR during real clinical interventions. Each multi-view frame consists of three color and three depth images. The MVOR dataset was sampled from four days of recording in an interventional room at the University Hospital of Strasbourg during procedures such as vertebroplasty and lung biopsy. In total, there are 4,699 bounding boxes, 2,926 2D keypoint annotations, and 1,061 3D keypoint annotations.

14 papers | 0 benchmarks

BioLAMA

BioLAMA is a benchmark comprising 49K biomedical factual knowledge triples for probing biomedical Language Models. It is used to assess how well Language Models can serve as valid biomedical knowledge bases.

14 papers | 0 benchmarks | Biomedical

ObjectFolder

ObjectFolder is a dataset for multisensory object-centric perception, reasoning, and interaction. It consists of 100 virtualized objects. ObjectFolder encodes the visual, auditory, and tactile sensory data for all objects, enabling a number of multisensory object recognition tasks.

14 papers | 0 benchmarks

PASS (Pictures without humAns for Self-Supervision)

PASS is a large-scale image dataset, containing 1.4 million images, that does not include any humans and which can be used for high-quality pretraining while significantly reducing privacy concerns.

14 papers | 0 benchmarks | Images

GraphQuestions

GraphQuestions is a characteristic-rich dataset designed for factoid question answering. The dataset aims to provide a systematic way of constructing QA datasets with rich and explicitly specified question characteristics.

14 papers | 2 benchmarks

IIIT5k

The IIIT5K dataset contains 5,000 text instance images: 2,000 for training and 3,000 for testing. It contains words from street scenes and from originally-digital images. Every image is associated with a 50-word lexicon and a 1,000-word lexicon.

14 papers | 3 benchmarks | Images

SVTP

SVTP dataset stands for Scene Text Recognition Datasets. It is a collection of 4 popular Latin/English scene text recognition datasets, namely IIIT5K, SVT, SVTP, and CUTE-80. These datasets only provide case-insensitive annotations and no punctuation marks.

14 papers | 3 benchmarks

DuLeMon (Baidu Long-term Memory Conversation)

DuLeMon is a large-scale Chinese Long-term Memory Conversation dataset, which simulates long-term memory conversations and focuses on the ability to actively construct and utilize the user's and the bot's persona in a long-term interaction. DuLeMon contains about 27.5k human-human conversations, 449k utterances, and 12k persona grounding sentences. This corpus can be used to explore Long-term Memory Conversation, Personalized Dialogue, and Persona Extraction / Matching / Retrieval.

14 papers | 0 benchmarks | Texts

Amazon Men

This dataset is a subset of the Amazon reviews dataset that contains men's products.

14 papers | 3 benchmarks
