1,019 machine learning datasets
The CUHK Square dataset supports transfer-learning research on adapting generic pedestrian detectors. It includes a 60-minute traffic video sequence recorded by a stationary camera, with a scene resolution of 720 × 576 pixels.
The Grand Central Station dataset includes a video with 50,010 frames, used for scene understanding and crowd analysis.
A dataset of annotated independently moving objects (IMO). It contains left and right stereo images, stereo disparity computed with SGM, and vehicle labels, as well as ground-truth annotations.
This is a dataset for vehicle detection. It consists of:
FPV-O is a multi-subject first-person vision dataset of office activities. Office activities include person-to-person interactions, such as chatting and handshaking, person-to-object interactions, such as using a computer or a whiteboard, as well as generic activities such as walking. The videos in the dataset present a number of challenges that, in addition to intra-class differences and inter-class similarities, include frames with illumination changes, motion blur, and lack of texture.
The medaka (Oryzias latipes) and the zebrafish (Danio rerio) are used as model organisms for a variety of subjects in biomedical research. The presented work studies the potential of automated ventricular dimension estimation through heart segmentation in medaka; for details, see our paper and the supplementary materials.
The MOBIO database consists of bi-modal (audio and video) data taken from 152 people. The database has a female-to-male ratio of nearly 1:2 (100 males and 52 females) and was collected from August 2008 until July 2010 at six sites in five countries. This led to a diverse bi-modal database with both native and non-native English speakers.
Toronto NeuroFace Dataset: A New Dataset for Facial Motion Analysis in Individuals with Neurological Disorders
The dataset contains more than 35,000 images and 600 videos captured using 35 different portable devices of 11 major brands. In addition to the original acquisitions, images were shared through Facebook and WhatsApp, whereas videos were shared through YouTube and WhatsApp.
To study kinship verification from gait, we collected the KinGaitWild dataset, consisting of 105 YouTube videos of celebrities and their relatives. Most of the videos were taken under uncontrolled conditions in terms of background, camera motion, luminance, and viewpoint. The average duration of each video is around 10 seconds. The database includes 60 Father-Son (FS) pairs, split equally into 5 groups; this study focuses on the Father-Son relationship. The collection proceeded as follows. First, we used the YouTube Data API to search for videos showing celebrities walking in the wild. To avoid biases, we selected pairs of celebrities so that the videos did not originate from the same source or environment. For each video, we labeled the position of each specified person with a bounding box (bbox). These bboxes are used to estimate the human pose for silhouette-based approaches.
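The bbox annotations described above are the entry point for silhouette-based processing. As a minimal sketch (the `(x, y, w, h)` bbox convention and the brightness-threshold foreground mask are assumptions for illustration, not the dataset's actual format or method):

```python
import numpy as np

def crop_bbox(frame: np.ndarray, bbox: tuple) -> np.ndarray:
    """Crop a person region from a video frame given an (x, y, w, h) bbox."""
    x, y, w, h = bbox
    return frame[y:y + h, x:x + w]

# Toy grayscale frame with a bright "person" region at (40, 20), size 30x60.
frame = np.zeros((120, 160), dtype=np.uint8)
frame[20:80, 40:70] = 200

person = crop_bbox(frame, (40, 20, 30, 60))
# Naive foreground mask standing in for a real silhouette extractor.
silhouette = (person > 127).astype(np.uint8)
print(person.shape)           # (60, 30)
print(int(silhouette.sum()))  # 1800
```

In practice the silhouette would come from a background-subtraction or segmentation model applied inside each annotated bbox, not from a fixed threshold.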
The DAHLIA dataset [1] is devoted to human activity recognition, a major issue for adaptive smart-home services such as user assistance. DAHLIA was recorded on the Mobile Mii platform by CEA LIST and was partly supported by the ITEA 3 EmoSpaces project (https://itea3.org/project/emospaces.html).
This dataset defines a total of 11 crowd motion patterns and is composed of over 6,000 video sequences with an average length of 100 frames per sequence. This documentation presents how to download and process the Crowd-11 dataset.
InfiniteRep is a synthetic, open-source dataset for fitness and physical therapy (PT) applications. It includes 1k videos of diverse avatars performing multiple repetitions of common exercises. It includes significant variation in the environment, lighting conditions, avatar demographics, and movement trajectories. From cadence to kinematic trajectory, each rep is done slightly differently -- just like real humans. InfiniteRep videos are accompanied by a rich set of pixel-perfect labels and annotations, including frame-specific repetition counts.
The dataset has been designed to represent true web videos in the wild, with good visual quality and diverse content characteristics. The test video collection for TRECVID AVS 2019 through TRECVID AVS 2021 contains 1,082,649 web video clips with even more diverse content, no predominant characteristics, and low self-similarity.
Description: 895 fire videos with a total duration of 27 hours, 6 minutes, and 48.58 seconds. The videos were shot with different cameras, during both day and night. The dataset can be used for tasks such as fire detection.
Infinity AI's Spills Basic Dataset is a synthetic, open-source dataset for safety applications. It features 150 videos of photorealistic liquid spills across 15 common settings. Spills take on in-context reflections, caustics, and depth based on the surrounding environment, lighting, and floor. Each video contains a spill of unique properties (size, color, profile, and more) and is accompanied by pixel-perfect labels and annotations. This dataset can be used to develop computer vision algorithms to detect the location and type of spill from the perspective of a fixed camera.
STVD-FC is the largest public dataset for political content analysis and fact-checking tasks. It consists of more than 1,200 fact-checked claims that have been scraped from a fact-checking service, with associated metadata. For the video counterpart, the dataset contains nearly 6,730 TV programs, with metadata, for a total duration of 6,540 hours. These programs were collected during the 2022 French presidential election with a dedicated workstation and protocol. The dataset is delivered in several parts, with proper indexes, to keep the 2 TB of data accessible. More information about the STVD-FC dataset can be found in the publication [1].
This dataset focuses only on the robbery category: a new weakly labelled dataset of 486 real-world robbery surveillance videos acquired from public sources.
Laser powder bed fusion (LPBF) is an additive manufacturing (3D printing) process for metals. RAISE-LPBF is a large dataset on the effect of laser power and laser dot speed in 316L stainless steel bulk material. Both process parameters are independently sampled for each scan line from a continuous distribution, so interactions of different parameter choices can be investigated. Process monitoring comprises on-axis high-speed (20k FPS) video. The data can be used to derive statistical properties of LPBF, as well as to build anomaly detectors.
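The per-scan-line sampling design can be illustrated with a short sketch. The uniform ranges below are illustrative placeholders, not the dataset's actual parameter ranges; the point is that independent draws decorrelate power and speed, which is what makes interaction effects estimable:

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n_lines = 1000

# Independently sample the two process parameters for each scan line
# (illustrative ranges, not RAISE-LPBF's published ones).
power = rng.uniform(100.0, 300.0, size=n_lines)   # laser power, W
speed = rng.uniform(400.0, 1200.0, size=n_lines)  # dot speed, mm/s

# Linear energy input per line is proportional to power / speed (J/mm);
# with independent draws, power and speed are (nearly) uncorrelated, so the
# (power, speed) plane is covered and interactions can be studied.
energy_density = power / speed
corr = np.corrcoef(power, speed)[0, 1]
print(round(float(corr), 3))  # close to 0 for independent sampling
```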
Human activity recognition and clinical biomechanics are challenging problems in physical telerehabilitation medicine. However, most publicly available datasets on human body movements cannot be used to study both problems in an out-of-the-lab movement acquisition setting. The objective of the VIDIMU dataset is to pave the way toward affordable patient-tracking solutions for remote recognition of daily-life activities and kinematic analysis.