CholecT40 is the first endoscopic dataset introduced to enable research on fine-grained action recognition in laparoscopic surgery.
RL Unplugged is a suite of benchmarks for offline reinforcement learning. RL Unplugged is designed with ease of use in mind: we provide the datasets with a unified API, which makes it easy for the practitioner to work with all data in the suite once a general pipeline has been established. This is a dataset accompanying the paper RL Unplugged: Benchmarks for Offline Reinforcement Learning.
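As a concrete illustration, the sketch below reads one of the RL Unplugged datasets through TensorFlow Datasets, which packages the suite behind the unified episode/steps layout described above. The builder name rlu_control_suite/cartpole_swingup is an assumption; check the TFDS catalog for the exact names in your version.

```python
# A minimal sketch of reading an RL Unplugged dataset via TensorFlow Datasets.
# The builder/config name below is an assumption; consult the TFDS catalog
# for the exact names available in your installed version.
import tensorflow_datasets as tfds

ds = tfds.load("rlu_control_suite/cartpole_swingup", split="train")
for episode in ds.take(1):
    # Each element is one episode; its transitions are a nested dataset.
    for step in episode["steps"].take(3):
        print(step["observation"], step["action"], step["reward"])
```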
Thumb Index 1000 (TI1K) is a dataset of 1000 hand images annotated with the hand bounding box and the positions of the thumb and index fingertips. The dataset captures natural movement of the thumb and index fingers, making it suitable for mixed reality (MR) applications.
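For illustration, here is a hedged sketch of overlaying TI1K-style annotations on an image; the annotation layout assumed below (a corner-format box plus two fingertip points) is inferred from the description, not taken from the dataset's actual file format.

```python
# Hypothetical TI1K-style annotation overlay: the (x1, y1, x2, y2) box and
# (x, y) fingertip format is an assumption, not the dataset's real schema.
import cv2

def draw_annotation(img, box, thumb_tip, index_tip):
    """All coordinates are integer pixel positions."""
    x1, y1, x2, y2 = box
    cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)  # hand box
    cv2.circle(img, thumb_tip, 5, (255, 0, 0), -1)          # thumb fingertip
    cv2.circle(img, index_tip, 5, (0, 0, 255), -1)          # index fingertip
    return img
```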
SuperCaustics is a simulation tool made in Unreal Engine for generating massive computer vision datasets that include transparent objects.
To evaluate the presented approaches, we created the Physical Anomalous Trajectory or Motion (PHANTOM) dataset consisting of six classes featuring everyday objects or physical setups, and showing nine different kinds of anomalies. We designed our classes to evaluate detection of various modes of video abnormalities that are generally excluded in video AD settings.
To provide ground-truth supervision for video consistency modeling, we build a high-quality dynamic OLAT dataset. Our capture system consists of a light stage setup with 114 LED light sources and a Phantom Flex4K-GS camera (a stationary, global-shutter, 4K ultra-high-speed camera running at 1000 fps), resulting in dynamic OLAT image sets recorded at 25 fps using the overlapping method. Our dynamic OLAT dataset provides sufficient semantic, temporal, and lighting consistency supervision to train our neural video portrait relighting scheme, which generalizes to in-the-wild scenarios.
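The stated rates imply the following back-of-the-envelope arithmetic; note that the 40-frame stride is an inference from the numbers above, not a detail given in the text.

```python
# With 114 lights, one full OLAT group spans 114 frames of the 1000 fps
# capture, so non-overlapping groups would only yield ~8.8 sets per second;
# an overlapping (sliding-window) grouping advanced by 1000 / 25 = 40 frames
# per set recovers the stated 25 fps. The stride here is inferred, not given.
CAMERA_FPS = 1000
NUM_LIGHTS = 114
TARGET_OLAT_FPS = 25

non_overlapping_rate = CAMERA_FPS / NUM_LIGHTS   # ~8.8 OLAT sets per second
stride_frames = CAMERA_FPS / TARGET_OLAT_FPS     # 40-frame stride between sets

print(f"{non_overlapping_rate:.1f} sets/s without overlap; "
      f"a {stride_frames:.0f}-frame stride gives {TARGET_OLAT_FPS} sets/s")
```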
A simulated benchmark for evaluating multi-modal SLAM systems in large-scale dynamic environments.
The Fetoscopic Placental Vessel Segmentation and Registration (FetReg2021) challenge was organized as part of the MICCAI 2021 Endoscopic Vision (EndoVis) challenge. Through the FetReg2021 challenge, we released the first large-scale multi-centre dataset of the fetoscopy laser photocoagulation procedure. The dataset contains 2,718 pixel-wise annotated images (background, vessel, fetus, and tool classes) from 24 different in vivo TTTS fetoscopic surgeries, plus 24 unannotated video clips containing 9,616 frames, for training and testing. The dataset is useful for developing generalized and robust semantic segmentation and video mosaicking algorithms for long-duration fetoscopy videos.
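As an illustration of working with the four-class pixel-wise labels, the sketch below colorizes an integer label mask; the class-id assignment is an assumption, so check the challenge's label specification for the actual mapping.

```python
# Colorize a FetReg-style segmentation mask. The class-id order below is an
# assumption for illustration, not the challenge's official label mapping.
import numpy as np

CLASS_COLORS = {
    0: (0, 0, 0),      # background
    1: (255, 0, 0),    # vessel
    2: (0, 255, 0),    # fetus
    3: (0, 0, 255),    # tool
}

def colorize(mask: np.ndarray) -> np.ndarray:
    """mask: (H, W) integer class ids -> (H, W, 3) uint8 RGB image."""
    out = np.zeros((*mask.shape, 3), dtype=np.uint8)
    for cls, color in CLASS_COLORS.items():
        out[mask == cls] = color
    return out
```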
PUMaVOS is a dataset of challenging and practical use cases inspired by the movie production industry.
Video sequences captured at a field on Campus Kleinaltendorf (CKA), University of Bonn, by BonBot-I, an autonomous weeding robot. The data was recorded with an Intel RealSense D435i sensor mounted with a nadir view of the ground.
3DYoga90 is organized within a three-level label hierarchy. It stands out as one of the most comprehensive open datasets, featuring the largest collection of RGB videos and 3D skeleton sequences among publicly available resources.
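A minimal sketch of how a three-level label hierarchy like 3DYoga90's can be represented in code; the example level names below are hypothetical, not taken from the dataset's actual taxonomy.

```python
# Hypothetical three-level pose label, mirroring the hierarchy described
# above; the example values are illustrative, not 3DYoga90's real labels.
from dataclasses import dataclass

@dataclass(frozen=True)
class PoseLabel:
    level1: str  # coarse category, e.g. a pose family
    level2: str  # intermediate grouping within the family
    level3: str  # the specific pose class

label = PoseLabel("standing", "balancing", "tree_pose")  # hypothetical values
print(label)
```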
The Lund University Vision, Radio, and Audio (LuViRA) positioning dataset consists of 89 trajectories recorded in the Lund University Humanities Lab's Motion Capture (Mocap) Studio, using a MIR200 robot as the target platform. Each trajectory contains data from four different systems: vision, radio, audio, and a ground-truth system with localization accuracy within 0.5 mm. A Motion Capture (Mocap) system in the environment serves as the ground truth, providing 3D or 6DoF tracking of a camera, a single antenna, and a speaker. These targets are mounted on top of the MIR200 robot and put in motion. The 3D positions of the 11 static microphones are also provided.
A real-world dataset with a hyper-accurate digital counterpart and comprehensive ground-truth annotations.
The Robot House Multi-View (RHM) dataset contains four views: Front, Back, Ceiling, and Robot. There are 14 classes with 6,701 video clips per view, for a total of 26,804 video clips across the four views. The clips are between 1 and 5 seconds long. Clips with the same index and class are synchronized across the different views.
Understanding and analyzing animal behavior is increasingly essential to protect endangered animal species. However, the application of advanced computer vision techniques in this area remains minimal, largely due to the lack of large and diverse datasets for training deep models.
This dataset involves a robot interacting with 5.1 cm colored blocks to complete an order-fulfillment style block-stacking task. It contains dynamic scenes and real time-series data in a less constrained environment than comparable datasets. There are nearly 12,000 stacking attempts and over 2 million frames of real data.
The SoccerTrack dataset comprises top-view and wide-view video footage annotated with bounding boxes. GNSS coordinates of each player are also provided. We hope that the SoccerTrack dataset will help advance the state of the art in multi-object tracking, especially in team sports.
In this dataset, two robots, Baxter and UR5, perform 8 behaviors (look, grasp, pick, hold, shake, lower, drop, and push) on 95 objects that vary by 5 colors (blue, green, red, white, and yellow), 6 contents (wooden button, plastic dice, glass marbles, nuts & bolts, pasta, and rice), and 4 weights (empty, 50g, 100g, and 150g). There are 90 objects with contents (5 colors x 3 weights x 6 contents) and 5 objects without any content that vary only by color. Both robots perform 5 trials on each object, resulting in 7,600 interactions (2 robots x 8 behaviors x 95 objects x 5 trials).
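The object and interaction counts above check out with a few lines of enumeration:

```python
# Verify the combinatorics stated above: 90 filled objects + 5 empty ones,
# and 2 robots x 8 behaviors x 95 objects x 5 trials = 7,600 interactions.
from itertools import product

colors = ["blue", "green", "red", "white", "yellow"]
contents = ["wooden button", "plastic dice", "glass marbles",
            "nuts & bolts", "pasta", "rice"]
weights = ["50g", "100g", "150g"]  # the three non-empty weight levels

filled = list(product(colors, contents, weights))  # 5 x 6 x 3 = 90 objects
empty = [(c, None, "empty") for c in colors]       # 5 content-free objects
objects = filled + empty                           # 95 objects in total

robots, behaviors, trials = 2, 8, 5
interactions = robots * behaviors * len(objects) * trials
print(len(objects), interactions)  # -> 95 7600
```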
We provide separate training, development and test data. The training data is available right away. The development and test data will be released in several stages, starting with a release of the development sources only.
ConsInv is a stereo RGB + IMU dataset designed for Dynamic SLAM testing and contains two subsets: