135 machine learning datasets
We present the first fine-grained dataset of 1,497 3D VR sketch and 3D shape pairs, covering 1,005 chair shapes with large shape diversity from the ShapeNetCore dataset and collected from 50 participants.
Dataset of low-fidelity solutions of the Reynolds-averaged Navier-Stokes (RANS) equations over airfoils.
This dataset provides wireless measurements from two industrial testbeds: iV2V (industrial Vehicle-to-Vehicle) and iV2I+ (industrial Vehicular-to-Infrastructure plus sensor).
A simulated dataset built in Unreal Engine 4 with AirSim, designed for visual point cloud change detection. It includes ground-truth (GT) point clouds before and after changes, as well as 4 trajectories with stereo camera and IMU data recorded for the change detection task.
The data set contains point cloud data captured in an indoor environment with precise localization and ground truth mapping information. Two "stop-and-go" data sequences of a robot with a mounted Ouster OS1-128 lidar are provided. This data-capturing strategy allows recording lidar scans that do not suffer from an error caused by sensor movement. Individual scans from static robot positions are recorded. Additionally, point clouds recorded with the Leica BLK360 scanner are provided as mapping ground-truth data.
The increasing use of deep learning techniques has reduced interpretation time and, ideally, reduced interpreter bias by automatically deriving geological maps from digital outcrop models. However, accurate validation of these automated mapping approaches is a significant challenge due to the subjective nature of geological mapping and the difficulty in collecting quantitative validation data. Additionally, many state-of-the-art deep learning methods are limited to 2D image data, which is insufficient for 3D digital outcrops, such as hyperclouds. To address these challenges, we present Tinto, a multi-sensor benchmark digital outcrop dataset designed to facilitate the development and validation of deep learning approaches for geological mapping, especially for non-structured 3D data like point clouds. Tinto comprises two complementary sets: 1) a real digital outcrop model from Corta Atalaya (Spain), with spectral attributes and ground-truth data, and 2) a synthetic twin that uses latent
Overview: This dataset encompasses a compilation of 6,700 executed scoops (excavations), mapped across a vast spectrum of materials, terrain topography, and compositions.
Robot@Home2 is an enhanced version of the Robot@Home dataset, aimed at improving usability and functionality for developing and testing mobile robotics and computer vision algorithms. Robot@Home2 consists of three main components. Firstly, a relational database that states the contextual information and data links, compatible with Structured Query Language (SQL). Secondly, a Python package for managing the database, including downloading, querying, and interfacing functions. Finally, learning resources in the form of Jupyter notebooks, runnable locally or on the Google Colab platform, enabling users to explore the dataset without local installations. These freely available tools are expected to enhance the ease of exploiting the Robot@Home dataset and accelerate research in computer vision and robotics.
UAV laser scanning data collected over a neotropical forest (Paracou, French Guiana). Four flights were conducted over a 1 ha plot in 2021 and 2022.
From https://github.com/MMintLab/VIRDO/blob/master/data/dataset_readme.txt,
Connectivity is a main driver for the ongoing megatrend of automated mobility: future Cooperative Intelligent Transport Systems (C-ITS) will connect road vehicles, traffic signals, roadside infrastructure, and even vulnerable road users, sharing data and compute for safer, more efficient, and more comfortable mobility. In terms of communication technology for realizing such vehicle-to-everything (V2X) communication, the WLAN-based peer-to-peer approach (IEEE 802.11p, ITS-G5 in Europe) competes with C-V2X based on cellular technologies (4G and beyond). Irrespective of the underlying communication standard, common message interfaces are crucial for a common understanding between vehicles, especially from different manufacturers. Targeting this issue, the European Telecommunications Standards Institute (ETSI) has been standardizing V2X message formats such as the Cooperative Awareness Message (CAM). In this work, we present V2AIX, a multi-modal real-world dataset of ETSI ITS messages gath
ECLAIR (Extended Classification of Lidar for AI Recognition) is a new outdoor large-scale aerial LiDAR dataset designed specifically for advancing research in point cloud semantic segmentation. As the most extensive and diverse collection of its kind to date, the dataset covers a total area of 10 km² with close to 600 million points and features eleven distinct object categories. To guarantee the dataset's quality and utility, we have thoroughly curated the point labels through an internal team of experts, ensuring accuracy and consistency in semantic labeling. The dataset is engineered to move forward the fields of 3D urban modeling, scene understanding, and utility infrastructure management by presenting new challenges and potential applications.
A large-scale, egocentric, multimodal, and context-aware dataset of human demonstrations of social navigation.
Heritage Pointcloud Instance Collection dataset, acquired from two large buildings and annotated at a point-wise semantic level based on existing BIM models. Devid Campagnolo, Elena Camuffo, Umberto Michieli, Paolo Borin, Simone Milani and Andrea Giordano, "Fully Automated Scan-to-BIM via Point Cloud Instance Segmentation", In Proceedings of the International Conference on Image Processing (ICIP) 2023.
This dataset is structured for the physics-informed training of neural operators on irregular domain geometries. It provides the FEM results of solving a Darcy problem on a pentagram-shaped domain. The GitHub repository of the paper that first used this dataset is: https://github.com/WeihengZ/PI-DCON.
ConSLAM is a real-world dataset collected periodically on a construction site to measure the accuracy of mobile scanners' SLAM algorithms.
This is the official dataset collected to test sim-to-real transfer. It contains 6 articulated object instances, each captured from 20 camera views under 5 states, in scenarios with and without background and with and without distractors.
A small-scale, real-world Project Aria dataset with high-quality static 3D oriented bounding box annotations.
Two versions of the dataset are offered: one is the full dataset used to train the models in DeformPAM, and the other is a mini dataset for easier examination. Both datasets include data for the supervised and finetuning stages of granular pile shaping, rope shaping, and T-shirt unfolding.
The SimBEV dataset is a collection of 320 scenes spread across all 11 CARLA maps and contains data from a variety of sensors, including five camera types (RGB, semantic segmentation, instance segmentation, depth, and optical flow), lidar, semantic lidar, radar, GNSS, and IMU, along with 3D object bounding boxes and accurate bird's-eye view (BEV) ground truth. With each scene lasting 16 seconds at a frame rate of 20 Hz, the SimBEV dataset contains 102,400 annotated frames, over 8 million 3D object bounding boxes, and more than 2.5 billion BEV ground truth labels.
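The frame count quoted for SimBEV follows directly from the scene count, scene duration, and frame rate given in the description. A minimal sketch of that arithmetic is shown below; the variable names are illustrative only and are not part of any SimBEV tooling.

```python
# Figures taken from the SimBEV description above; names are illustrative.
NUM_SCENES = 320        # scenes across the CARLA maps
SCENE_DURATION_S = 16   # seconds per scene
FRAME_RATE_HZ = 20      # frames per second

# Each scene contributes duration * frame rate annotated frames.
frames_per_scene = SCENE_DURATION_S * FRAME_RATE_HZ
total_frames = NUM_SCENES * frames_per_scene

print(f"frames per scene: {frames_per_scene}")
print(f"total annotated frames: {total_frames}")
```

The product matches the 102,400 annotated frames stated in the description.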