383 machine learning datasets
383 dataset results
Minecraft House is a crowd sourced dataset that collects examples of humans building houses in Minecraft. Each user is asked to build a CraftAssist: A Framework for Dialogue-enabled Interactive Agents house on a fixed time budget (30 minutes), without any additional guidance or instructions. Every action of the user is recorded using the Cuberite server.
ObjectNet3D is a large scale database for 3D object recognition, named, that consists of 100 categories, 90,127 images, 201,888 objects in these images and 44,147 3D shapes. Objects in the images in the database are aligned with the 3D shapes, and the alignment provides both accurate 3D pose annotation and the closest 3D shape annotation for each 2D object. Consequently, the database is useful for recognizing the 3D pose and 3D shape of objects from 2D images. Authors also provide baseline experiments on four tasks: region proposal generation, 2D object detection, joint 2D detection and 3D object pose estimation, and image-based 3D shape retrieval, which can serve as baselines for future research.
Automated leaf segmentation is a challenging area in computer vision. Recent advances in machine learning approaches allowed to achieve better results than traditional image processing techniques; however, training such systems often require large annotated data sets. To contribute with annotated data sets and help to overcome this bottleneck in plant phenotyping research, here we provide a novel photometric stereo (PS) data set with annotated leaf masks. This data set forms part of the work done in the BBSRC Tools and Resources Development project BB/N02334X/1.
The platelet-em dataset contains two 3D scanning electron microscope (EM) images of human platelets, as well as instance and semantic segmentations of those two image volumes. This data has been reviewed by NIBIB, contains no PII or PHI, and is cleared for public release. All files use a multipage uint16 TIF format. A 3D image with size [Z, X, Y] is saved as Z pages of size [X, Y]. Image voxels are approximately 40x10x10 nm
This collection contains data and code associated with the IPCAI/IJCARS 2020 paper “Automatic Annotation of Hip Anatomy in Fluoroscopy for Robust and Efficient 2D/3D Registration.” The data hosted here consists of annotated datasets of actual hip fluoroscopy, CT and derived data from six lower torso cadaveric specimens. Documentation and examples for using the dataset and Python code for training and testing the proposed models are also included. Higher-level information, including clinical motivations, prior works, algorithmic details, applications to 2D/3D registration, and experimental details, may be found in the companion paper which is available at https://arxiv.org/abs/1911.07042 or https://doi.org/10.1007/s11548-020-02162-7. We hope that this code and data will be useful in the development of new computer-assisted capabilities that leverage fluoroscopy.
EPISURG is a clinical dataset of $T_1$-weighted magnetic resonance images (MRI) from 430 epileptic patients who underwent resective brain surgery at the National Hospital of Neurology and Neurosurgery (Queen Square, London, United Kingdom) between 1990 and 2018.
The RISE (Robust Indoor Localization in Complex Scenarios) dataset is meant to train and evaluate visual indoor place recognizers. It contains more than 1 million geo-referenced images spread over 30 sequences, covering 5 heterogeneous buildings. For each building we provide: - A high resolution 3D point cloud (1cm) that defines the localization reference frame and that was generated with a mobile laser scanner and an inertial system. - Several image sequences spread over time with accurate ground truth poses retrieved by the laser scanner. Each sequence contains both, stereo pairs and spherical images. - Geo-referenced smartphone data, retrieved from the standard sensors of such devices.
This mouse cerebellar atlas can be used for mouse cerebellar morphometry.
The full IFCNet dataset currently consists of 19,000 CAD models distributed over 65 classes according to the taxonomy of the Industry Foundation Classes (IFC) standard. The IFC standard provides an open data exchange format for projects in the Architecture, Engineering and Construction (AEC) domain. Due to high imbalances with respect to the number of objects in each class, a subset of 8,000 objects from 20 classes is selected to form the IFCNetCore dataset, providing a more balanced distribution. Apart from the geometric information of the CAD model, most objects also have semantic information in the form of key-value pairs, enums or lists, which are relevant to different stages of the construction process.
Large-scale shadows from buildings in a city play an important role in determining the environmental quality of public spaces. They can be both beneficial, such as for pedestrians during summer, and detrimental, by impacting vegetation and by blocking direct sunlight. Determining the effects of shadows requires the accumulation of shadows over time across different periods in a year. In our paper Shadow Accrual Maps: Efficient Accumulation of City-Scale Shadows over Time, we present a simple yet efficient class of approach that uses the properties of sun movement to track the changing position of shadows within a fixed time interval. This repository presents the computed shadow information for New York City, Chicago, Los Angeles, Boston and Washington DC.
Pano3D is a new benchmark for depth estimation from spherical panoramas. Its goal is to drive progress for this task in a consistent and holistic manner. The Pano3D 360 depth estimation benchmark provides a standard Matterport3D train and test split, as well as a secondary GibsonV2 partioning for testing and training as well. The latter is used for zero-shot cross dataset transfer performance assessment and decomposes it into 3 different splits, each one focusing on a specific generalization axis.
The dataset contains patches of facial reflectance as described in the paper, namely the diffuse albedo, diffuse normals, specular albedo, specular normals, as well as the shape in UV space. For the shape, reconstructed meshes have been registered to a common topology and the XYZ values of the points have been mapped to the RGB in UV coordinates and interpolated to complete the UV map. From the complete UV maps of 6144x4096 pixels, patches of 512x512 pixels have been sampled. The dataset contains 7500 such patches (1500 of each datatype) that are anonymized, randomized and sampled so that they do not contain identifiable features.
IKEA Object State Dataset is a new dataset that contains IKEA furniture 3D models, RGBD video of the assembly process, the 6DoF pose of furniture parts and their bounding box.
The goal of the challenge is to compare automated algorithms that are able to detect and segment various types of fluids on a common dataset of optical coherence tomography (OCT) volumes representing different retinal diseases, acquired with devices from different manufacturers. We made available a dataset of OCT volumes containing a wide variety of retinal fluid lesions with accompanying reference annotations. We invite the medical imaging community to participate by developing and testing existing and novel automated retinal OCT segmentation methods.
DRACO20K dataset is used for evaluating object canonicalization on methods that estimate a canonical frame from a monocular input image.
We introduce a new dataset of annotated surveillance videos of freely moving people taken from a distance in both indoor and outdoor scenes. The videos are captured with multiple cameras placed in eight different daily environments. People in the videos undergo large pose variations and are frequently occluded by various environmental factors. Most important, their eyes are mostly not clearly visible as is often the case in surveillance videos. We introduce the first rigorously annotated dataset of 3D gaze directions of freely moving people captured from afar.
This dataset contains charge densities for NMC (Ni, Mn and Co) 2x2x1 supercell (12 transition metal atoms and 12 Li/vacancy site) with varying levels of Li content. For each structure we first randomly sample the number of Mn, Ni and Co atoms given that the total number of transition metal atoms is 12 and then randomly assign them to the transition metal positions of the lattice. Similarly the number of vacancies is uniformly sampled between 0 and 12 and vacancies are assigned to the Li site. The generated configurations are then relaxed in two steps: First we relax the atom positions with fixed cell parameters and then we allow both positions and cell parameters to relax. We keep only the electron density (CHGCAR) file after the last cell relaxation step. The atoms are relaxed until forces on each atom are lower than 0.01 eV/Å.
DifferSketching is a dataset of freehand sketches to understand how differently professional and novice users sketch 3D objects. It includes 3,620 freehand multi-view sketches registered with their corresponding 3D objects. To date, the dataset is an order of magnitude larger than the existing datasets.
PoseScript is a dataset that pairs a few thousand 3D human poses from AMASS with rich human-annotated descriptions of the body parts and their spatial relationships. This dataset is designed for the retrieval of relevant poses from large-scale datasets and synthetic pose generation, both based on a textual pose description.
ReplicaGrasp dataset is created by spawning objects from GRAB into the ReplicaCAD scenes, simulated in random positions and orientations using the Habitat simulator. We capture 4,800 instances, with 50 different objects spawned in one of 48 receptacles in both, upright and randomly fallen orientations.