19,997 machine learning datasets
19,997 dataset results
The QMUL underGround Re-IDentification (GRID) dataset contains 250 pedestrian image pairs. Each pair contains two images of the same individual seen from different camera views. All images are captured from 8 disjoint camera views installed in a busy underground station. The figures beside show a snapshot of each of the camera views of the station and sample images in the dataset. The dataset is challenging due to variations of pose, colours, lighting changes; as well as poor image quality caused by low spatial resolution.
The USF Human ID Gait Challenge Dataset is a dataset of videos for gait recognition. It has videos from 122 subjects in up to 32 possible combinations of variations in factors.
CHALET is a 3D house simulator with support for navigation and manipulation. Unlike existing systems, CHALET supports both a wide range of object manipulation, as well as supporting complex environemnt layouts consisting of multiple rooms. The range of object manipulations includes the ability to pick up and place objects, toggle the state of objects like taps or televesions, open or close containers, and insert or remove objects from these containers. In addition, the simulator comes with 58 rooms that can be combined to create houses, including 10 default house layouts. CHALET is therefore suitable for setting up challenging environments for various AI tasks that require complex language understanding and planning, such as navigation, manipulation, instruction following, and interactive question answering.
Griddly is an environment for grid-world based research. Griddly provides a highly optimized game state and rendering engine with a flexible high-level interface for configuring environments. Not only does Griddly offer simple interfaces for single, multi-player and RTS games, but also multiple methods of rendering, configurable partial observability and interfaces for procedural content generation.
DCASE2018 Task 4 is a dataset for large-scale weakly labeled semi-supervised sound event detection in domestic environments. The data are YouTube video excerpts focusing on domestic context which could be used for example in ambient assisted living applications. The domain was chosen due to the scientific challenges (wide variety of sounds, time-localized events...) and potential industrial applications. Specifically, the task employs a subset of “Audioset: An Ontology And Human-Labeled Dataset For Audio Events” by Google. Audioset consists of an expanding ontology of 632 sound event classes and a collection of 2 million human-labeled 10-second sound clips (less than 21% are shorter than 10-seconds) drawn from 2 million Youtube videos. The ontology is specified as a hierarchical graph of event categories, covering a wide range of human and animal sounds, musical instruments and genres, and common everyday environmental sounds. Task 4 focuses on a subset of Audioset that consists of 10
Rico is a public UI corpus with 72K Android UI screens mined from 9.7K Android apps (Deka et al., 2017). Each screen in Rico comes with a screenshot image and a view hierarchy of a collection of UI objects. Authors manually removed screens whose view hierarchies do not match their screenshots by asking annotators to visually verify whether the bounding boxes of view hierarchy leaves match each UI object on the corresponding screenshot image. This filtering results in 25K unique screens.
The Discovery datasets consists of adjacent sentence pairs (s1,s2) with a discourse marker (y) that occurred at the beginning of s2. They were extracted from the depcc web corpus.
CLEVR-Dialog is a large diagnostic dataset for studying multi-round reasoning in visual dialog. Specifically, that authors construct a dialog grammar that is grounded in the scene graphs of the images from the CLEVR dataset. This combination results in a dataset where all aspects of the visual dialog are fully annotated. In total, CLEVR-Dialog contains 5 instances of 10-round dialogs for about 85k CLEVR images, totaling to 4.25M question-answer pairs.
This dataset was created with images provided by the United States National Archive and FamilySearch.
Chart2Text is a dataset that was crawled from 23,382 freely accessible pages from statista.com in early March of 2020, yielding a total of 8,305 charts, and associated summaries. For each chart, the chart image, the underlying data table, the title, the axis labels, and a human-written summary describing the statistic was downloaded.
2-PM Vessel is an open-source volumetric brain vasculature dataset obtained with two-photon microscopy at Focused Ultrasound Lab, at Sunnybrook Research Institute (affiliated with University of Toronto by Dr. Alison Burgess, Charissa Poon and Marc Santos. The dataset contains a total of 12 volumetric stacks consisting of images of mouse brain vasculature and tumour vasculature.
Contains densely labeled speech activity in YouTube videos, with the goal of creating a shared, available dataset for this task.
An expert-annotated word similarity dataset which provides a highly reliable, yet challenging, benchmark for rare word representation techniques.
Consist of 23,533 statements extracted from all U.S. general election presidential debates and annotated by human coders. The ClaimBuster dataset can be leveraged in building computational methods to identify claims that are worth fact-checking from the myriad of sources of digital or traditional media.
A large-scale curated dataset of over 152 million tweets, growing daily, related to COVID-19 chatter generated from January 1st to April 4th at the time of writing.
Curates a large pelvic CT dataset pooled from multiple sources and different manufacturers, including 1, 184 CT volumes and over 320, 000 slices with different resolutions and a variety of the above-mentioned appearance variations.
CubiCasa5K is a large-scale floorplan image dataset containing 5000 samples annotated into over 80 floorplan object categories. The dataset annotations are performed in a dense and versatile manner by using polygons for separating the different objects.
DAWN emphasizes a diverse traffic environment (urban, highway and freeway) as well as a rich variety of traffic flow. The DAWN dataset comprises a collection of 1000 images from real-traffic environments, which are divided into four sets of weather conditions: fog, snow, rain and sandstorms. The dataset is annotated with object bounding boxes for autonomous driving and video surveillance scenarios. This data helps interpreting effects caused by the adverse weather conditions on the performance of vehicle detection systems.
DeepScores contains high quality images of musical scores, partitioned into 300,000 sheets of written music that contain symbols of different shapes and sizes. For advancing the state-of-the-art in small objects recognition, and by placing the question of object recognition in the context of scene understanding.
A new dataset of handwritten text with fine-grained annotations at the character level and report results from an initial user evaluation.