A challenging multi-frame interpolation dataset for autonomous driving scenarios. Built on the principles of hard-sample selection and scenario diversity, the NL-Drive dataset contains point cloud sequences with large nonlinear movements drawn from three public large-scale autonomous driving datasets: KITTI, Argoverse, and nuScenes. The overall dataset contains more than 20,000 LiDAR point cloud frames, with sequences captured at 10 Hz, and is split into training, validation, and test sets in a 14:3:3 ratio. For the point cloud interpolation task, input frames are selected at a fixed frame interval, and the remaining point clouds serve as ground truth for the interpolated frames. In particular, each NL-Drive sample consists of four input frames at 2.5 Hz when there are three interpolation frames to predict between the middle two input frames.
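The sampling scheme above can be sketched in a few lines of Python. This is a minimal illustration, assuming a list of consecutive 10 Hz frame paths; the function name and layout are hypothetical, not part of any official NL-Drive tooling:

```python
# Illustrative sketch: build NL-Drive-style interpolation samples from a
# consecutive 10 Hz point cloud sequence. Names here are hypothetical.
from typing import List, Tuple

def make_interpolation_samples(
    frames: List[str],  # paths to consecutive 10 Hz point cloud frames
    interval: int = 4,  # keep every 4th frame -> 2.5 Hz input rate
) -> List[Tuple[List[str], List[str]]]:
    """Return (inputs, targets): four input frames spaced `interval` apart,
    and the `interval - 1` skipped frames between the middle two inputs."""
    samples = []
    for i in range(0, len(frames) - 3 * interval):
        inputs = [frames[i + k * interval] for k in range(4)]
        mid_start = i + interval  # index of the second input frame
        targets = frames[mid_start + 1 : mid_start + interval]  # 3 frames
        samples.append((inputs, targets))
    return samples
```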
Intraoral 3D scan analysis is a fundamental aspect of Computer-Aided Dentistry (CAD) systems, playing a crucial role in various dental applications, including teeth segmentation, detection, labeling, and dental landmark identification. Accurate analysis of 3D dental scans is essential for orthodontic and prosthetic treatment planning, as it enables automated processing and reduces the need for manual adjustments by dental professionals. However, developing robust automated tools for these tasks remains a significant challenge due to the limited availability of high-quality public datasets and benchmarks. This article introduces Teeth3DS+, the first comprehensive public benchmark designed to advance the field of intraoral 3D scan analysis. Developed as part of the 3DTeethSeg 2022 and 3DTeethLand 2024 MICCAI challenges, Teeth3DS+ aims to drive research in teeth identification, segmentation, labeling, 3D modeling, and dental landmark identification. The dataset includes at least 1,800 intraoral scans.
A dataset for position-constrained robot grasp planning.
The BiGe corpus comprises 54,360 shots of interest extracted from TED and TEDx talks. All shots are tracked with full 3D landmarks.
Depth vision has recently been used in many locomotion devices with the aim of easing the lives of disabled people and supporting a more ecological lifestyle, since such cameras are cheap and compact and provide rich information about the environment. Our dataset provides many recordings of point clouds and other types of data during different locomotion modes in an urban context. If you use this data, please cite the following papers:
1. Depth Vision based Terrain Detection Algorithm during Human Locomotion
2. Using Depth Vision for Terrain Detection during Active Locomotion
A real-world dataset with a hyper-accurate digital counterpart and comprehensive ground-truth annotations.
BASEPROD provides comprehensive rover sensor data collected over a 1.7 km traverse, accompanied by high-resolution 2D and 3D drone maps of the terrain. The dataset also includes laser-induced breakdown spectroscopy (LIBS) measurements from key sampling sites along the rover's path, as well as weather station data to contextualize environmental conditions.
CLAD (Complex and Long Activities Dataset) is an activity dataset that exhibits real-life, diverse scenarios of complex, temporally extended human activities and actions. The dataset consists of a set of videos of actors performing everyday activities in a natural and unscripted manner. It was recorded using a static Kinect 2 sensor, which is commonly used on many robotic platforms. The dataset comprises RGB-D images, point cloud data, and automatically generated skeleton tracks, in addition to crowdsourced annotations.
This dataset involves a robot interacting with 5.1 cm colored blocks to complete an order-fulfillment-style block-stacking task. It contains dynamic scenes and real time-series data in a less constrained environment than comparable datasets, with nearly 12,000 stacking attempts and over 2 million frames of real data.
BigBIRD is a 3D dataset of 125 objects; for each object, it provides 600 3D point clouds and 600 high-resolution (12 MP) images spanning all views, along with calibration information and reconstructed object meshes.
Near-Collision is a large-scale dataset of 13,658 egocentric video snippets of humans navigating indoor hallways. To support ground-truth annotation of human pose, the videos are provided with the corresponding 3D LiDAR point clouds.
The RBO dataset of articulated objects and interactions is a collection of 358 RGB-D video sequences (67 min 18 s) of humans manipulating 14 articulated objects under varying conditions (lighting, perspective, background, interaction). All sequences are annotated with ground-truth poses of the rigid parts and the kinematic state of the articulated object (joint states), obtained with a motion capture system. We also provide complete kinematic models of these objects (kinematic structure and three-dimensional textured shape models). In 78 sequences, the contact wrenches during the manipulation are also provided.
The dataset contains synthetic training, validation, and test data for occupancy grid mapping from lidar point clouds. Additionally, real-world lidar point clouds from a test vehicle with the same lidar setup as the simulated sensor are provided. Point clouds are stored as PCD files, and occupancy grid maps are stored as PNG images, where one image channel encodes the evidence for a free cell state and another encodes the evidence for an occupied cell state.
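As a sketch of how such a grid map PNG could be decoded, assuming one channel holds the free evidence and another the occupied evidence (the concrete channel assignment is an assumption; check the dataset documentation):

```python
# Illustrative decoding of an occupancy grid map PNG into evidence masses.
import numpy as np
from PIL import Image

def load_occupancy_evidence(png_path: str):
    img = np.asarray(Image.open(png_path), dtype=np.float32) / 255.0
    ev_free = img[..., 0]      # assumed: channel 0 = evidence for "free"
    ev_occupied = img[..., 1]  # assumed: channel 1 = evidence for "occupied"
    ev_unknown = 1.0 - ev_free - ev_occupied  # remaining mass: unknown
    return ev_free, ev_occupied, ev_unknown
```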
The ARPA-E-funded TERRA-REF project is generating open-access reference datasets for the study of plant sensing, genomics, and phenomics. Sensor data were generated by a field scanner sensing platform that captures color, thermal, hyperspectral, and active fluorescence imagery, as well as three-dimensional structure and associated environmental measurements. This dataset is provided alongside data collected using traditional field methods in order to support calibration and validation of algorithms used to extract plot-level phenotypes from these datasets.
EUEN17037 Daylight and View Standard Test Dataset.
The Cooperative Driving dataset is a synthetic dataset generated using CARLA that contains lidar data from multiple vehicles navigating simultaneously through a diverse set of driving scenarios. This dataset was created to enable further research in multi-agent perception (cooperative perception) including cooperative 3D object detection, cooperative object tracking, multi-agent SLAM and point cloud registration. Towards that goal, all the frames have been labelled with ground-truth sensor pose and 3D object bounding boxes.
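Since every frame carries a ground-truth sensor pose, scans from different vehicles can be brought into a shared frame before fusion. A minimal sketch, assuming poses are given as 4x4 sensor-to-world homogeneous transforms (the dataset's concrete pose format may differ):

```python
# Illustrative fusion of multi-vehicle lidar scans via ground-truth poses.
import numpy as np

def to_world(points: np.ndarray, pose: np.ndarray) -> np.ndarray:
    """points: (N, 3) in the sensor frame; pose: (4, 4) sensor-to-world."""
    homo = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    return (homo @ pose.T)[:, :3]

def fuse_scans(scans, poses):
    """Stack scans from several vehicles in the shared world frame."""
    return np.vstack([to_world(p, T) for p, T in zip(scans, poses)])
```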
The dataset consists of both real captures from Photoneo PhoXi structured-light scanners, annotated by hand, and synthetic samples produced by a custom generator. It differs from existing datasets for 6D pose estimation in several notable ways.
A simulated dataset of electromagnetic (EM) showers, containing 16,577 showers. For each tracklet, the data includes position coordinates, direction, and shower id; for each shower, it includes the shower id, the initial particle position and direction, and the shower energy.
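The listed fields suggest a simple record layout; the dataclasses below are a hypothetical sketch of that schema, not the dataset's own API:

```python
# Hypothetical container types mirroring the tracklet/shower fields above.
from dataclasses import dataclass
import numpy as np

@dataclass
class Tracklet:
    position: np.ndarray   # (x, y, z) position coordinates
    direction: np.ndarray  # direction vector
    shower_id: int         # id of the shower this tracklet belongs to

@dataclass
class Shower:
    shower_id: int
    initial_position: np.ndarray   # initial particle position
    initial_direction: np.ndarray  # initial particle direction
    energy: float                  # shower energy
```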
To study data-scarcity mitigation for learning-based visual localization methods via sim-to-real transfer, we curate and present the CrossLoc benchmark datasets: multimodal aerial sim-to-real data for flights above natural and urban terrain. Unlike previous computer vision datasets that focus on localization in a single domain (mostly real RGB images), the provided benchmark datasets include various multimodal synthetic cues paired with all real photos. Complementary to the paired real and synthetic data, we offer rich synthetic data that efficiently fills the flight-envelope volume in the vicinity of the real data.
A cross-city unsupervised domain adaptation (UDA) benchmark built upon nuScenes.