Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Datasets

192 machine learning datasets

Filter by Modality

  • Images (3,275)
  • Texts (3,148)
  • Videos (1,019)
  • Audio (486)
  • Medical (395)
  • 3D (383)
  • Time series (298)
  • Graphs (285)
  • Tabular (271)
  • Speech (199)
  • RGB-D (192)
  • Environment (148)
  • Point cloud (135)
  • Biomedical (123)
  • LiDAR (95)
  • RGB Video (87)
  • Tracking (78)
  • Biology (71)
  • Actions (68)
  • 3d meshes (65)
  • Tables (52)
  • Music (48)
  • EEG (45)
  • Hyperspectral images (45)
  • Stereo (44)
  • MRI (39)
  • Physics (32)
  • Interactive (29)
  • Dialog (25)
  • Midi (22)
  • 6D (17)
  • Replay data (11)
  • Financial (10)
  • Ranking (10)
  • Cad (9)
  • fMRI (7)
  • Parallel (6)
  • Lyrics (2)
  • PSG (2)
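The modality filter above can be thought of as a simple tag query over dataset records: each dataset carries a set of modality tags, and filtering returns the records containing the selected tag. A minimal sketch, assuming a hypothetical record structure (the entries and field names below are illustrative, not the site's actual data model):

```python
# Hypothetical dataset records, each tagged with its modalities
# (names taken from entries on this page for illustration only).
datasets = [
    {"name": "Sugar Beets 2016", "modalities": {"Images", "RGB-D"}},
    {"name": "Robot@Home", "modalities": {"Images", "LiDAR", "RGB-D", "Videos"}},
    {"name": "DAHLIA", "modalities": {"RGB-D", "Videos"}},
]

def filter_by_modality(records, modality):
    """Return the records tagged with the given modality."""
    return [r for r in records if modality in r["modalities"]]

# Selecting "RGB-D" keeps every record carrying that tag.
rgbd = filter_by_modality(datasets, "RGB-D")
print([r["name"] for r in rgbd])
```

Storing modalities as sets makes membership tests O(1) per record, so a single filter pass is linear in the number of records.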

192 dataset results

Sugar Beets 2016

Sugar Beets 2016 is a robot dataset for plant classification as well as localization and mapping that covers the relevant stages for robotic intervention and weed control. It contains around 5 TB of data recorded from a robot equipped with a 4-channel multi-spectral camera and an RGB-D sensor to capture detailed information about the plantation.

0 papers · 0 benchmarks · Images, RGB-D

Robot@Home dataset

The Robot-at-Home dataset (Robot@Home) is a collection of raw and processed data from five domestic settings compiled by a mobile robot equipped with 4 RGB-D cameras and a 2D laser scanner. Its main purpose is to serve as a testbed for semantic mapping algorithms through the categorization of objects and/or rooms.

0 papers · 0 benchmarks · Images, LiDAR, RGB-D, Videos

MHRI dataset (Multimodal Human-Robot Interaction dataset)

The dataset includes recordings of 10 different users teaching the robot various common kitchen objects. It consists of synchronized recordings from three cameras and a microphone mounted on the robot.

0 papers · 0 benchmarks · Audio, Images, RGB-D

Toronto NeuroFace Dataset

Toronto NeuroFace Dataset: A New Dataset for Facial Motion Analysis in Individuals with Neurological Disorders

0 papers · 0 benchmarks · Images, Medical, RGB-D, Videos

DAHLIA (DAily Human Life Activity)

The DAHLIA dataset [1] is devoted to human activity recognition, a major issue for adapting smart-home services such as user assistance. DAHLIA was realized on the Mobile Mii Platform by CEA LIST and was partly supported by the ITEA 3 Emospaces Project (https://itea3.org/project/emospaces.html).

0 papers · 0 benchmarks · RGB-D, Videos

InfiniteRep

InfiniteRep is a synthetic, open-source dataset for fitness and physical therapy (PT) applications. It includes 1k videos of diverse avatars performing multiple repetitions of common exercises, with significant variation in environment, lighting conditions, avatar demographics, and movement trajectories. From cadence to kinematic trajectory, each rep is performed slightly differently, just like real humans. InfiniteRep videos are accompanied by a rich set of pixel-perfect labels and annotations, including frame-specific repetition counts.

0 papers · 0 benchmarks · 3D, 3d meshes, Actions, Biomedical, Images, RGB Video, RGB-D, Tracking, Videos

Pose Estimation Lunar Robot (Dataset for camera pose estimation research using computer simulated images from rovers on the lunar surface)

The goal is to use simulation data to train neural networks to estimate the pose of a rover's camera with respect to a known target object.

0 papers · 0 benchmarks · Images, Point cloud, RGB-D

UNIPD-BPE (University of Padova Body Pose Estimation)

The University of Padova Body Pose Estimation dataset (UNIPD-BPE) is an extensive dataset for multi-sensor body pose estimation, containing both single-person and multi-person sequences with up to 4 interacting people. A network of 5 Microsoft Azure Kinect RGB-D cameras records synchronized high-definition RGB and depth data of the scene from multiple viewpoints and estimates the subjects' poses using the Azure Kinect Body Tracking SDK. Simultaneously, full-body Xsens MVN Awinda inertial suits provide accurate poses and anatomical joint angles, as well as raw data from the 17 IMUs required by each suit. All cameras and inertial suits are hardware synchronized, and the relative pose of each camera with respect to the inertial reference frame is calibrated before each sequence to ensure maximum overlap of the two sensing systems' outputs.

0 papers · 0 benchmarks · RGB-D, Tracking

HEADSET (HEADSET: Human Emotion Awareness under Partial Occlusions Multimodal DataSET)

The volumetric representation of human interactions is one of the fundamental domains in the development of immersive media productions and telecommunication applications. Particularly in the context of the rapid advancement of Extended Reality (XR) applications, volumetric data has proven to be an essential technology for future XR development. In this work, we present a new multimodal database to help advance the development of immersive technologies. Our database provides ethically compliant and diverse volumetric data: 27 participants displaying posed facial expressions and subtle body movements while speaking, plus 11 participants wearing head-mounted displays (HMDs). The recording system consists of a volumetric capture (VoCap) studio comprising 31 synchronized modules with 62 RGB cameras and 31 depth cameras. In addition to textured meshes, point clouds, and multi-view RGB-D data, one Lytro Illum camera simultaneously provides light field (LF) data.

0 papers · 0 benchmarks · 3D, 3d meshes, Audio, Images, Point cloud, RGB Video, RGB-D, Videos

HouseCat6D (A Large-Scale Multi-Modal Category Level 6D Object Perception Dataset with Household Objects in Realistic Scenarios)

Estimating 6D object poses is a major challenge in 3D computer vision. Building on successful instance-level approaches, research is shifting towards category-level pose estimation for practical applications. Current category-level datasets, however, fall short in annotation quality and pose variety. Addressing this, we introduce HouseCat6D, a new category-level 6D pose dataset. It features 1) multimodality with Polarimetric RGB and Depth (RGBD+P), 2) 194 diverse objects across 10 household categories, including two photometrically challenging ones, and 3) high-quality pose annotations with an error range of only 1.35 mm to 1.74 mm. The dataset also includes 4) 41 large-scale scenes with comprehensive viewpoint and occlusion coverage, 5) a checkerboard-free environment, and 6) dense 6D parallel-jaw robotic grasp annotations. Additionally, we present benchmark results for leading category-level pose estimation networks.

0 papers0 benchmarksRGB-D

Primitive Shape Abstraction

RGB-D Images for Real-World and Synthetic Object Scenes: this dataset consists of both real-world and synthetic RGB-D images, designed for object detection, classification, and segmentation tasks, particularly primitive shape recognition.

0 papers0 benchmarksRGB-D

IITKGP_Fence Dataset

The IITKGP_Fence dataset is designed for tasks related to fence-like occlusion detection, defocus blur, depth mapping, and object segmentation. The captured data varies in scene composition, background defocus, and object occlusions. The dataset comprises both labeled and unlabeled data, as well as additional video and RGB-D data, and contains ground-truth occlusion masks (GT) for the corresponding images. The ground-truth occlusion labels were created in a semi-automatic way with user interaction.

0 papers0 benchmarksImages, RGB Video, RGB-D
Page 10 of 10