Datasets

383 machine learning datasets

ZInD (Zillow Indoor Dataset)

The Zillow Indoor Dataset (ZInD) provides extensive visual data that covers a real-world distribution of unfurnished residential homes. It consists of primary 360° panoramas with annotated room layouts, windows, doors and openings (W/D/O), merged rooms, secondary localized panoramas, and final 2D floor plans. The figure above illustrates the various representations (from left to right beyond capture): room layout with W/D/O annotations, merged layouts, 3D textured mesh, and final 2D floor plan.

18 papers | 8 benchmarks | 3D, Images

FineDance

18 papers | 8 benchmarks | 3D, Music

PU1K

PU1K is nearly 8 times larger than the largest publicly available dataset collected by PU-GAN. PU1K consists of 1,147 3D models split into 1,020 training samples and 127 testing samples. The training set contains 120 3D models compiled from PU-GAN’s dataset, in addition to 900 different models collected from ShapeNetCore. The testing set contains 27 models from PU-GAN and 100 more models from ShapeNetCore.

18 papers | 0 benchmarks | 3D
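
As a quick sanity check on the PU1K figures quoted above, the split sizes can be recomputed from the per-source counts. This is only a minimal sketch: the dictionary layout and the variable names are illustrative assumptions and are not part of any PU1K tooling.

```python
# Per-source model counts as stated in the PU1K description.
# The dictionary structure and names are hypothetical, for illustration only.
pu1k_counts = {
    "train": {"PU-GAN": 120, "ShapeNetCore": 900},
    "test": {"PU-GAN": 27, "ShapeNetCore": 100},
}

# Sum the sources within each split, then sum the splits for the overall total.
split_totals = {split: sum(sources.values()) for split, sources in pu1k_counts.items()}
total_models = sum(split_totals.values())

print(split_totals)
print(total_models)
```

The recomputed totals should agree with the 1,020 training samples, 127 testing samples, and 1,147 models given in the description.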

Wild-Places

Many existing datasets for lidar place recognition are solely representative of structured urban environments, and have recently been saturated in performance by deep-learning-based approaches. Natural and unstructured environments present many additional challenges for the tasks of long-term localisation, but these environments are not represented in currently available datasets. To address this, we introduce Wild-Places, a challenging large-scale dataset for lidar place recognition in unstructured, natural environments. Wild-Places contains eight lidar sequences collected with a handheld sensor payload over the course of fourteen months, containing a total of 63K undistorted lidar submaps along with accurate 6DoF ground truth. This dataset contains multiple revisits both within and between sequences, allowing for both intra-sequence (i.e., loop closure detection) and inter-sequence (i.e., re-localisation) tasks. We also benchmark several state-of-the-art approaches to demonstrate t

18 papers | 2 benchmarks | 3D, LiDAR

SceneNet

SceneNet is a dataset of several labelled synthetic indoor scenes.

17 papers | 0 benchmarks | 3D, Images

ViViD++ (Vision for Visibility Dataset)

A dataset capturing diverse visual data formats that target varying luminance conditions. It was recorded with alternative vision sensors, either handheld or mounted on a car, repeatedly in the same space but under different conditions.

17 papers | 0 benchmarks | 3D, Images, LiDAR, RGB Video, RGB-D

VehicleX

VehicleX is a large-scale synthetic dataset. Created in Unity, it contains 1,362 vehicles of various 3D models with fully editable attributes.

16 papers | 0 benchmarks | 3D, Images

DublinCity

A novel benchmark dataset that includes a manually annotated point cloud in which over 260 million laser scanning points are labelled into approximately 100,000 assets, taken from the 2015 Dublin LiDAR point cloud [12]. Objects are labelled into 13 classes using hierarchical levels of detail, from large (i.e., building, vegetation and ground) to refined (i.e., window, door and tree) elements.

15 papers | 0 benchmarks | 3D, LiDAR, Point cloud

3D AffordanceNet

3D AffordanceNet is a dataset of 23k shapes for visual affordance. It consists of 56,307 well-defined affordance information annotations for 22,949 shapes covering 18 affordance classes and 23 semantic object categories.

15 papers | 2 benchmarks | 3D, 3d meshes

CustomHumans

CustomHumans is recorded by a multi-view photogrammetry system equipped with 53 RGB (12-megapixel) and 53 IR (4-megapixel) cameras. The resulting high-quality scan is composed of a 40K-face mesh alongside a 4K-resolution texture map. In addition to the high-quality scans, CustomHumans provides accurately registered SMPL-X parameters using a customized mesh registration pipeline. 80 participants were invited for the data capture. Each of them was instructed to perform several movements, such as "T-pose", "Hands Up", "Squat", "Turning head", and "Hand gestures", in a 10-second-long sequence (300 frames). The 4-5 best-quality meshes in each sequence are selected as the data samples. In total, the dataset contains more than 600 high-quality scans with 120 different garments.

15 papers | 4 benchmarks | 3D

SIZER

A dataset of clothing size variation that includes different subjects wearing casual clothing items in various sizes, totaling approximately 2,000 scans. The dataset includes the scans, registrations to the SMPL model, scans segmented into clothing parts, and garment category and size labels.

14 papers | 0 benchmarks | 3D

WHU

Created for multi-view stereo (MVS) tasks, WHU is a large-scale multi-view aerial dataset generated from a highly accurate 3D digital surface model produced from thousands of real aerial images with precise camera parameters.

14 papers | 0 benchmarks | 3D, Images

SynthRAD2023

Purpose: Medical imaging has become increasingly important in diagnosing and treating oncological patients, particularly in radiotherapy. Recent advances in synthetic computed tomography (sCT) generation have increased interest in public challenges to provide data and evaluation metrics for comparing different approaches openly. This paper describes a dataset of brain and pelvis computed tomography (CT) images with rigidly registered cone-beam CT (CBCT) and magnetic resonance imaging (MRI) images to facilitate the development and evaluation of sCT generation for radiotherapy planning.

14 papers | 0 benchmarks | 3D, Images, Medical

Super-CLEVR

Super-CLEVR is a dataset for Visual Question Answering (VQA) where different factors in VQA domain shifts can be isolated in order that their effects can be studied independently. It contains 21 vehicle models belonging to 5 categories, with controllable attributes. Four factors are considered: visual complexity, question redundancy, concept distribution and concept compositionality.

13 papers | 0 benchmarks | 3D

SILK (Synth It Like KITTI)

An important factor in advancing autonomous driving systems is simulation. Yet there has been rather little progress on transferability between the virtual and real world. We revisit this problem for 3D object detection on LiDAR point clouds and propose a dataset generation pipeline based on the CARLA simulator. Utilizing domain randomization strategies and careful modeling, we are able to train an object detector on the synthetic data and demonstrate strong generalization capabilities to the KITTI dataset.

12 papers | 0 benchmarks | 3D, Images

House3D Environment

A rich, extensible and efficient environment that contains 45,622 human-designed 3D scenes of visually realistic houses, ranging from single-room studios to multi-storied houses, equipped with a diverse set of fully labeled 3D objects, textures and scene layouts, based on the SUNCG dataset (Song et al.).

11 papers | 0 benchmarks | 3D, Environment

BIKED

BIKED is a dataset of 4,500 individually designed bicycle models sourced from hundreds of designers. BIKED enables a variety of data-driven design applications for bicycles and generally supports the development of data-driven design methods. The dataset includes a variety of design information, including assembly images, component images, numerical design parameters, and class labels.

11 papers | 0 benchmarks | 3D, Cad

BuildingNet

BuildingNet is a large-scale dataset of 3D building models whose exteriors are consistently labeled. The dataset consists of 513K annotated mesh primitives, grouped into 292K semantic part components across 2K building models. The dataset covers several building categories, such as houses, churches, skyscrapers, town halls, libraries, and castles.

11 papers | 0 benchmarks | 3D, 3d meshes

4D-OR

4D-OR includes a total of 6,734 scenes, recorded by six calibrated RGB-D Kinect sensors mounted to the ceiling of the OR at one frame per second, providing synchronized RGB and depth images. We provide fused point cloud sequences of entire scenes, automatically annotated human 6D poses and 3D bounding boxes for OR objects. Furthermore, we provide semantic scene graph (SSG) annotations for each step of the surgery together with the clinical roles of all the humans in the scenes, e.g., nurse, head surgeon, anesthesiologist.

11 papers | 7 benchmarks | 3D, Graphs, Images, Medical, Point cloud, RGB Video, RGB-D, Time series, Videos

UnrealEgo

UnrealEgo is a dataset that provides in-the-wild stereo images with a large variety of motions for 3D human pose estimation. The data consist of stereo fisheye images and depth maps, each with a resolution of 1024×1024 pixels, captured at 25 frames per second. In total, 450k stereo frames (900k images) are captured for the dataset. Metadata is provided for each frame, including 3D joint positions, camera positions, and 2D coordinates of reprojected joint positions in the fisheye views.

11 papers | 8 benchmarks | 3D, Images
