The NCANDA consortium is composed of an Administrative component at the University of California San Diego, a Data Analysis and Informatics component at SRI International, and five research sites (University of California San Diego, SRI International, Duke University, the University of Pittsburgh, and the Oregon Health & Science University). A sample of 831 individuals (ages 12-21) was recruited for the study across the five research sites. The enrolled participants are followed in an accelerated longitudinal design that involves structural and functional imaging of the brain along with extensive neuropsychological and clinical assessments.
This dataset consists of charge densities, computed with density functional theory (DFT), of individual snapshots from a molecular dynamics trajectory. We insert 8 ethylene carbonate molecules into the simulation box. To quickly explore a large part of the configurational space, we place Hookean constraints on the molecular bonds (to maintain molecular identity, so that molecules are not torn apart at such a high temperature) and run Langevin molecular dynamics with a thermostat temperature of 3000 K. The simulation was run for 12,380 steps of 0.5 fs.
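The step count and timestep above fix the total simulated time. A quick back-of-the-envelope check (illustrative only, not part of the dataset's tooling):

```python
# Total simulated time for the Langevin MD run described above:
# 12,380 steps at a 0.5 fs timestep.
n_steps = 12380
dt_fs = 0.5                    # timestep in femtoseconds
total_fs = n_steps * dt_fs
total_ps = total_fs / 1000.0   # 1 ps = 1000 fs
print(f"{total_fs:.0f} fs = {total_ps:.2f} ps")  # 6190 fs = 6.19 ps
```

So the trajectory covers roughly 6 ps of (high-temperature) dynamics.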
A Benchmark Dataset for Deep Learning-based Methods for 3D Topology Optimization.
A cross-city UDA benchmark built upon nuScenes.
We present the first fine-grained dataset of 1,497 3D VR sketch and 3D shape pairs, collected from 50 participants, covering 1,005 chair shapes with large shape diversity from the ShapeNetCore dataset.
Habitat-Matterport 3D Semantics Dataset (HM3D-Semantics v0.1) is the largest-ever dataset of semantically-annotated 3D indoor spaces. It contains dense semantic annotations for 120 high-resolution 3D scenes from the Habitat-Matterport 3D dataset. The HM3D scenes are annotated with 1,700+ raw object names, which are mapped to 40 Matterport categories. On average, each scene in HM3D-Semantics v0.1 consists of 646 objects from 114 categories.
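The per-scene average above gives a rough sense of the dataset's overall scale (an order-of-magnitude estimate only, since only the mean is quoted):

```python
# Approximate total number of annotated object instances in
# HM3D-Semantics v0.1, from the figures quoted above.
scenes = 120
avg_objects_per_scene = 646
approx_total = scenes * avg_objects_per_scene
print(approx_total)  # 77520 -> on the order of ~78k annotated objects
```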
The dataset comprises 2,886 patches in total (2 m GSD), of which 1,732 are for training and 1,154 for testing. The patch size varies (depending on the agricultural parcels) and is on average around 60×60 pixels. Each patch contains 150 contiguous hyperspectral bands (462-942 nm, with a spectral resolution of 3.2 nm), reflecting the spectral range of the hyperspectral imaging sensor deployed on board Intuition-1.
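The patch counts and spectral figures quoted above are internally consistent, which is easy to verify (a sanity check only; the exact band-center convention is an assumption):

```python
# Consistency checks on the figures quoted above.
train, test = 1732, 1154
assert train + test == 2886          # total patch count matches

# 462-942 nm covered at 3.2 nm spectral resolution:
span_nm = 942 - 462                  # 480 nm of spectral coverage
intervals = span_nm / 3.2
print(span_nm, intervals)            # 480 150.0 -> matches the 150 stated bands
```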
PaintNet is a dataset for learning robotic spray painting of free-form 3D objects. PaintNet includes more than 800 object meshes and the associated painting strokes collected in a real industrial setting.
A stack of 2D grayscale images of a glass fiber-reinforced polyamide 66 (GF-PA66) specimen, acquired by 3D X-ray computed tomography (XCT).
Scan Entities in 3D (ScanEnts3D) is a large-scale dataset that provides explicit correspondences between 369k objects across 84k natural referential sentences, covering 705 real-world scenes.
LiPC (LiDAR Point Cloud Clustering Benchmark Suite) is a benchmark suite for point cloud clustering algorithms based on open-source software and open datasets. It aims to provide the community with a collection of methods and datasets that are easy to use and comparable, and whose experimental results are traceable and reproducible.
L-CAS 3D Point Cloud People Dataset contains 28,002 Velodyne scan frames acquired in one of the main buildings (Minerva Building) of the University of Lincoln, UK. Total length of the recorded data is about 49 minutes. Data were grouped into two classes according to whether the robot was stationary or moving.
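The frame count and duration above imply an average scan rate, which is a useful plausibility check (back-of-the-envelope only; the 49-minute figure is approximate):

```python
# Implied average scan rate of the L-CAS recording:
# 28,002 Velodyne frames over roughly 49 minutes.
frames = 28002
duration_s = 49 * 60
rate_hz = frames / duration_s
print(f"{rate_hz:.1f} Hz")  # ~9.5 Hz, plausible for a Velodyne spinning near 10 Hz
```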
HPointLoc is a dataset designed for exploring the capabilities of visual place recognition in indoor environments and loop detection in simultaneous localization and mapping. It is built on the popular Habitat simulator, uses 49 photorealistic indoor scenes from the Matterport3D dataset, and contains 76,000 frames.
MTNeuro is a multi-task neuroimaging benchmark built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions.
The 'Me 163' was a Second World War fighter airplane that emerged from the German air force's secret development programmes. One of these airplanes is currently owned and displayed in the historic aircraft exhibition of the 'Deutsches Museum' in Munich, Germany. To gain insights with respect to its history, design, and state of preservation, a complete CT scan was obtained using an industrial XXL computed tomography scanner at Fraunhofer EZRT.
The Robot Tracking Benchmark (RTB) is a synthetic dataset that facilitates the quantitative evaluation of 3D tracking algorithms for multi-body objects. It was created using the procedural rendering pipeline BlenderProc. The dataset contains photo-realistic sequences with HDRi lighting and physically-based materials. Perfect ground-truth annotations for camera and robot trajectories are provided in the BOP format. Many physical effects, such as motion blur, rolling shutter, and camera shaking, are accurately modeled to reflect real-world conditions. For each frame, four depth qualities exist to simulate sensors with different characteristics. While the first quality provides perfect ground truth, the second considers measurements with the distance-dependent noise characteristics of the Azure Kinect time-of-flight sensor. Finally, for the third and fourth qualities, two stereo RGB images with and without a pattern from a simulated dot projector were rendered. Depth images were then reconstructed from these stereo pairs.
The dataset contains point cloud data captured in an indoor environment with precise localization and ground-truth mapping information. Two "stop-and-go" data sequences of a robot with a mounted Ouster OS1-128 lidar are provided. This data-capturing strategy allows recording lidar scans that do not suffer from errors caused by sensor movement. Individual scans from static robot positions are recorded. Additionally, point clouds recorded with the Leica BLK360 scanner are provided as mapping ground-truth data.
The increasing use of deep learning techniques has reduced interpretation time and, ideally, reduced interpreter bias by automatically deriving geological maps from digital outcrop models. However, accurate validation of these automated mapping approaches is a significant challenge due to the subjective nature of geological mapping and the difficulty in collecting quantitative validation data. Additionally, many state-of-the-art deep learning methods are limited to 2D image data, which is insufficient for 3D digital outcrops, such as hyperclouds. To address these challenges, we present Tinto, a multi-sensor benchmark digital outcrop dataset designed to facilitate the development and validation of deep learning approaches for geological mapping, especially for non-structured 3D data like point clouds. Tinto comprises two complementary sets: 1) a real digital outcrop model from Corta Atalaya (Spain), with spectral attributes and ground-truth data, and 2) a synthetic twin that uses latent
A dataset of 100K synthetic images of skin lesions, ground-truth (GT) segmentations of lesions and healthy skin, GT segmentations of seven body parts (head, torso, hips, legs, feet, arms and hands), and GT binary masks of non-skin regions in the texture maps of 215 scans from the 3DBodyTex.v1 dataset [2], [3] created using the framework described in [1]. The dataset is primarily intended to enable the development of skin lesion analysis methods. Synthetic image creation consisted of two main steps. First, skin lesions from the Fitzpatrick 17k dataset were blended onto skin regions of high-resolution three-dimensional human scans from the 3DBodyTex dataset [2], [3]. Second, two-dimensional renders of the modified scans were generated.
3D confocal stacks with corresponding 2D light-field microscope images.