383 machine learning datasets
383 dataset results
A real-world dataset, with hyper-accurate digital counterpart & comprehensive ground-truth annotation.
A high-quality synthetic dataset for object relighting. Covering a wide range of geometry and material.
A high-quality captured dataset for object relighting. Covering a wide range of geometry and material.
BASEPROD provides comprehensive rover sensor data collected over a 1.7 km traverse, accompanied by high-resolution 2D and 3D drone maps of the terrain. The dataset also includes laser-induced breakdown spectroscopy (LIBS) measurements from key sampling sites along the rover's path, as well as weather station data to contextualize environmental conditions.
Our dataset consists of over 1000 fractured frescoes. The RePAIR stands as a realistic computational challenge for methods for 2D and 3D puzzle solving, and serves as a benchmark that enables the study of fractured object reassembly and presents new challenges for geometric shape understanding. Please visit our website for more dataset information, access to source code scripts and for an interactive gallery viewing of the dataset samples.
ThermoHands is the first benchmark dataset specifically designed for egocentric 3D hand pose estimation from thermal images. It addresses the challenges of hand pose estimation in low-light conditions and when the hand is occluded by gloves or other wearables—scenarios where traditional RGB or NIR-based systems struggle.
Dataset of the Beacon3D benchmark: Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis.
The combinatorial 3D shape dataset is composed of 406 instances of 14 classes. Specifically, each object in the dataset is considered equivalent to a sequence of primitive placement.
A new large-scale dataset that consists of 409 fine-grained categories and 31,881 images with accurate 3D pose annotation.
Involves data where a robot interacts with 5.1 cm colored blocks to complete an order-fulfillment style block stacking task. It contains dynamic scenes and real time-series data in a less constrained environment than comparable datasets. There are nearly 12,000 stacking attempts and over 2 million frames of real data.
The PointDenoisingBenchmark dataset features 28 different shapes, split into 18 training shapes and 10 test shapes.
The ShapeNet-Skeleton dataset has ground-truth skeleton point sets and skeletal volumes for object instances in the ShapeNet dataset.
SIDOD is a new, publicly-available image dataset generated by the NVIDIA Deep Learning Data Synthesizer intended for use in object detection, pose estimation, and tracking applications. This dataset contains 144k stereo image pairs that synthetically combine 18 camera viewpoints of three photorealistic virtual environments with up to 10 objects (chosen randomly from the 21 object models of the YCB dataset) and flying distractors.
The Store Dataset is a dataset for estimating 3D poses of multiple humans in real-time. It is captured inside two kinds of simulated stores with 12 and 28 cameras, respectively.
Minecraft Segmentation is a segmentation dataset for the Minecraft House that adds semantic segmentation labels for sub-components of the house. There are 2050 houses in total and 1038 distinct labels of subcomponents.
TDW is a 3D virtual world simulation platform, utilizing state-of-the-art video game engine technology. A TDW simulation consists of two components: a) the Build, a compiled executable running on the Unity3D Engine, which is responsible for image rendering, audio synthesis and physics simulations; and b) the Controller, an external Python interface to communicate with the build.
Sara motion is a 3D motion dataset, named Synthetic Actors and Real Actions (SARA), for training a model to produce motion embeddings suitable for reasoning about motion similarity.
AMT Objects is a large dataset of object centric videos suitable for training and benchmarking models for generating 3D models of objects from a small number of photos of the objects. The dataset consists of multiple views of a large collection of object instances.
Boombox is a multi-modal dataset for visual reconstruction from acoustic vibrations. Involves dropping objects into a box and capturing resulting images and vibrations. Used for training ML systems that predict images from vibration.
Mouse Brain MRI atlas (both in-vivo and ex-vivo) (repository relocated from the original webpage)