383 machine learning datasets
CMD is a publicly available collection of hundreds of thousands of 2D maps and 3D grids containing different properties of the gas, dark matter, and stars from more than 2,000 different universes. The data has been generated from thousands of state-of-the-art (magneto-)hydrodynamic and gravity-only N-body simulations from the CAMELS project.
QDax is a benchmark suite designed for Deep Neuroevolution in Reinforcement Learning domains for robot control. The suite includes the definition of tasks, environments, behavioral descriptors, and fitness. It specifies different benchmarks based on the complexity of both the task and the agent controlled by a deep neural network. The benchmark uses standard Quality-Diversity metrics, including coverage, QD-score, maximum fitness, and an archive profile metric to quantify the relation between coverage and fitness.
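The first three Quality-Diversity metrics named above can be illustrated with a minimal sketch. This is not QDax code: the function name `qd_metrics`, the dictionary-based archive, and the parameter `num_cells` are hypothetical, and the formulas follow the definitions commonly used in the Quality-Diversity literature (coverage as the fraction of filled cells, QD-score as the sum of stored fitnesses), which the description here does not spell out.

```python
def qd_metrics(archive, num_cells):
    """Compute common Quality-Diversity metrics for a grid archive.

    archive: dict mapping a cell index to the fitness of the best
             solution stored in that cell (hypothetical representation).
    num_cells: total number of cells in the behavior-descriptor grid.
    """
    if num_cells <= 0:
        raise ValueError("num_cells must be positive")

    fitnesses = list(archive.values())

    # Coverage: fraction of cells that hold at least one solution.
    coverage = len(fitnesses) / num_cells

    # QD-score: sum of the fitnesses of all stored solutions
    # (assumes fitnesses are non-negative or already offset).
    qd_score = sum(fitnesses)

    # Maximum fitness: best single solution in the archive, if any.
    max_fitness = max(fitnesses) if fitnesses else None

    return {
        "coverage": coverage,
        "qd_score": qd_score,
        "max_fitness": max_fitness,
    }


# Toy archive: 3 of 10 cells are filled.
example = qd_metrics({0: 1.0, 3: 2.5, 7: 0.5}, num_cells=10)
```

Under these assumed definitions the toy archive has coverage 0.3, QD-score 4.0, and maximum fitness 2.5. The archive profile metric is omitted because its definition is not given here.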
Generating high-quality 3D ground-truth shapes for reconstruction evaluation is extremely challenging because even 3D scanners can only generate pseudo ground-truth shapes with artefacts. We propose a novel data capturing and 3D annotation pipeline to obtain precise 3D ground-truth shapes without relying on expensive 3D scanners. The key to creating the precise 3D ground-truth shapes is using LEGO models, which are made of LEGO bricks with known geometry. The MobileBrick dataset provides a unique opportunity for future research on high-quality 3D reconstruction thanks to two distinctive features: 1) A large number of RGBD sequences with precise 3D ground-truth annotations. 2) The RGBD images were captured using mobile devices so algorithms can be tested in a realistic setup for mobile AR applications.
MedShapeNet contains over 100,000 medical shapes, including bones, organs, vessels, muscles, etc., as well as surgical instruments. You can search for shapes, display them in 3D, and download individual shapes using our shape search engine. Note that MedShapeNet is provided for research and educational purposes only.
The RAD-ChestCT dataset is a large medical imaging dataset developed by Duke MD/PhD Rachel Draelos during her Computer Science PhD supervised by Lawrence Carin. The full dataset includes 35,747 chest CT scans from 19,661 adult patients. The public Zenodo repository contains an initial release of 3,630 chest CT scans, approximately 10% of the dataset. This dataset is of significant interest to the machine learning and medical imaging research communities.
Enhancing the robustness of vision algorithms in real-world scenarios is challenging. One reason is that existing robustness benchmarks are limited, as they either rely on synthetic data or ignore the effects of individual nuisance factors. We introduce OOD-CV, a benchmark dataset that includes out-of-distribution examples of 10 object categories in terms of pose, shape, texture, context and the weather conditions, and enables benchmarking models for image classification, object detection, and 3D pose estimation. In addition to this novel dataset, we contribute extensive experiments using popular baseline methods, which reveal that: 1. Some nuisance factors have a much stronger negative effect on the performance compared to others, also depending on the vision task. 2. Current approaches to enhance robustness have only marginal effects, and can even reduce robustness. 3. We do not observe significant differences between convolutional and transformer architectures. We believe our datase
IndustReal is an ego-centric, multi-modal dataset where 27 participants are challenged to perform assembly and maintenance procedures on a construction-toy car. The dataset is annotated for action recognition, assembly state detection, and procedure step recognition. IndustReal includes 38 execution errors in a total of 84 videos, with 14 exclusive to validation and test sets and therefore suitable for testing the robustness of algorithms against unseen errors in procedural tasks. IndustReal offers open-source 3D models for all parts to promote the use of synthetic data for scalable approaches on this dataset, as well as reproducibility. All assembly parts used in the dataset are 3D printed. This ensures reproducibility and future availability of the model and allows for growth via community effort.
PedX is a large-scale multi-modal collection of pedestrians at complex urban intersections. The dataset provides high-resolution stereo images and LiDAR data with manual 2D and automatic 3D annotations. The data was captured using two pairs of stereo cameras and four Velodyne LiDAR sensors.
A Large Dataset of Object Scans is a dataset of more than ten thousand 3D scans of real objects. To create the dataset, the authors recruited 70 operators, equipped them with consumer-grade mobile 3D scanning setups, and paid them to scan objects in their environments. The operators scanned objects of their choosing, outside the laboratory and without direct supervision by computer vision professionals. The result is a large and diverse collection of object scans: from shoes, mugs, and toys to grand pianos, construction vehicles, and large outdoor sculptures. The authors worked with an attorney to ensure that data acquisition did not violate privacy constraints. The acquired data was placed in the public domain and is available freely.
Endoscopic stereo reconstruction for surgical scenes gives rise to specific problems, including the lack of clear corner features, highly specular surface properties, and the presence of blood and smoke. These issues present difficulties for both stereo reconstruction itself and also for standardised dataset production. We present a stereo-endoscopic reconstruction validation dataset based on cone-beam CT (SERV-CT). Two ex vivo small porcine full torso cadavers were placed within the view of the endoscope with both the endoscope and target anatomy visible in the CT scan. Subsequent orientation of the endoscope was manually aligned to match the stereoscopic view and benchmark disparities, depths and occlusions are calculated. The requirement of a CT scan limited the number of stereo pairs to 8 from each ex vivo sample. For the second sample an RGB surface was acquired to aid alignment of smooth, featureless surfaces. Repeated manual alignments showed an RMS disparity accuracy of around
DurLAR is a high-fidelity 128-channel 3D LiDAR dataset with panoramic ambient (near infrared) and reflectivity imagery for multi-modal autonomous driving applications. Compared to existing autonomous driving task datasets, DurLAR has the following novel features:
This is a 16.2-million frame (50-hour) multimodal dataset of two-person face-to-face spontaneous conversations. This dataset features synchronized body and finger motion as well as audio data. It represents the largest motion capture and audio dataset of natural conversations to date. The statistical analysis verifies strong intraperson and interperson covariance of arm, hand, and speech features, potentially enabling new directions on data-driven social behavior analysis, prediction, and synthesis.
The aiMotive dataset is a multimodal dataset for robust autonomous driving with long-range perception. The dataset consists of 176 scenes with synchronized and calibrated LiDAR, camera, and radar sensors covering a 360-degree field of view. The collected data was captured in highway, urban, and suburban areas during daytime, night, and rain and is annotated with 3D bounding boxes with consistent identifiers across frames.
FFHQ-UV is a large-scale facial UV-texture dataset that contains over 50,000 high-quality texture UV-maps with even illumination, neutral expressions, and cleaned facial regions, which are desired characteristics for rendering realistic 3D face models under different lighting conditions. The dataset is derived from FFHQ and preserves most of the variation in FFHQ.
HOD is a dataset for 3D object reconstruction which contains 35 objects, divided into two subsets named Sculptures and Daily Objects. The Sculptures subset has five human sculptures with complex geometries and pure white textures. The Daily Objects subset consists of 30 daily objects with various shapes and appearances. All of the Sculptures and nine of the Daily Objects are paired with high-fidelity scanned meshes as ground truth geometries for evaluation.
CHI3D is a lab-based accurate 3D motion capture dataset with 631 sequences containing 2,525 contact events and 728,664 ground truth 3D poses. It is accompanied by FlickrCI3D, a dataset of 11,216 images, with 14,081 processed pairs of people, and 81,233 facet-level surface correspondences.
We present a large-scale dataset for 3D urban scene understanding. Compared to existing datasets, our dataset consists of 75 outdoor urban scenes with diverse backgrounds, encompassing over 15,000 images. These scenes offer 360° hemispherical views, capturing diverse foreground objects illuminated under various lighting conditions. Additionally, our dataset encompasses scenes that are not limited to forward-driving views, addressing the limitations of previous datasets such as limited overlap and coverage between camera views. The closest pre-existing dataset for generalizable evaluation is DTU [2] (80 scenes) which comprises mostly indoor objects and does not provide multiple foreground objects or background scenes.
Cancer in the region of the head and neck (HaN) is one of the most prominent cancers, for which radiotherapy represents an important treatment modality that aims to deliver a high radiation dose to the targeted cancerous cells while sparing the nearby healthy organs-at-risk (OARs). A precise three-dimensional spatial description, i.e. segmentation, of the target volumes as well as OARs is required for optimal radiation dose distribution calculation, which is primarily performed using computed tomography (CT) images. However, the HaN region contains many OARs that are poorly visible in CT, but better visible in magnetic resonance (MR) images. Although attempts have been made towards the segmentation of OARs from MR images, so far there has been no evaluation of the impact the combined analysis of CT and MR images has on the segmentation of OARs in the HaN region. The Head and Neck Organ-at-Risk Multi-Modal Segmentation Challenge aims to promote the development of new and application of
MatSynth is a Physically Based Rendering (PBR) materials dataset designed for modern AI applications. This dataset consists of over 4,000 ultra-high resolution materials, offering unparalleled scale, diversity, and detail.