Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

H2O (2 Hands and Objects)

Introduced 2021-04-21

We present a comprehensive framework for egocentric interaction recognition using markerless 3D annotations of two hands manipulating objects. To this end, we propose a method to create a unified dataset for egocentric 3D interaction recognition. Our method produces annotations of the 3D pose of two hands and the 6D pose of the manipulated objects, along with their interaction labels for each frame. Our dataset, called H2O (2 Hands and Objects), provides synchronized multi-view RGB-D images, interaction labels, object classes, ground-truth 3D poses for the left and right hands, 6D object poses, ground-truth camera poses, object meshes and scene point clouds. To the best of our knowledge, this is the first benchmark that enables the study of first-person actions using the pose of both left and right hands manipulating objects, and it presents an unprecedented level of detail for egocentric 3D interaction recognition. We further propose a method to predict interaction classes by estimating the 3D pose of two hands and the 6D pose of the manipulated objects jointly from RGB images. Our method models both inter- and intra-dependencies between hands and objects by learning the topology of a graph convolutional network that predicts interactions. We show that our method, facilitated by this dataset, establishes a strong baseline for joint hand-object pose estimation and achieves state-of-the-art accuracy for first-person interaction recognition.
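To make the per-frame annotations described above more concrete, here is a minimal Python sketch of what one annotated frame could look like in memory. It is an illustration only, not the dataset's actual file layout or loader: the class names, field names, the `NUM_JOINTS = 21` joint count, and the placeholder label strings are all assumptions introduced for this example. The only fact relied on is that a 6D pose combines a 3D rotation with a 3D translation.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

# Assumption: 21 joints per hand, a common hand-skeleton convention.
NUM_JOINTS = 21


@dataclass
class ObjectPose6D:
    """Hypothetical 6D pose: a 3x3 rotation matrix plus a 3D translation."""
    rotation: List[List[float]]  # row-major 3x3 rotation matrix
    translation: Vec3

    def transform(self, p: Vec3) -> Vec3:
        """Map a point from the object frame to the camera frame: R @ p + t."""
        r, t = self.rotation, self.translation
        return tuple(
            r[i][0] * p[0] + r[i][1] * p[1] + r[i][2] * p[2] + t[i]
            for i in range(3)
        )


@dataclass
class FrameAnnotation:
    """Hypothetical container for the labels attached to a single frame."""
    left_hand: List[Vec3]    # 3D joint positions of the left hand
    right_hand: List[Vec3]   # 3D joint positions of the right hand
    object_class: str        # placeholder class name
    object_pose: ObjectPose6D
    interaction_label: str   # placeholder interaction name


def make_example_frame() -> FrameAnnotation:
    """Build a synthetic frame; all values are made up for illustration."""
    zero_hand = [(0.0, 0.0, 0.0)] * NUM_JOINTS
    pose = ObjectPose6D(
        # 90-degree rotation about the z-axis
        rotation=[[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
        translation=(0.0, 0.0, 1.0),
    )
    return FrameAnnotation(
        left_hand=list(zero_hand),
        right_hand=list(zero_hand),
        object_class="example_object",
        object_pose=pose,
        interaction_label="example_interaction",
    )


frame = make_example_frame()
# A point on the object's x-axis, expressed in the camera frame.
moved = frame.object_pose.transform((1.0, 0.0, 0.0))
```

Keeping the rotation and translation together in one record mirrors the idea of a 6D pose as a single rigid transform; a real loader would read these values from the dataset's own files rather than construct them by hand.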

Benchmarks

3D Action Recognition / Accuracy
Action Detection / Accuracy
Action Localization / Accuracy
Action Recognition / Actions Top-1
Action Recognition / RGB
Action Recognition / Hand Pose
Action Recognition / Object Pose
Action Recognition / Object Label
Action Recognition / Accuracy
Activity Recognition / Actions Top-1
Activity Recognition / RGB
Activity Recognition / Hand Pose
Activity Recognition / Object Pose
Activity Recognition / Object Label
Activity Recognition / Accuracy
Temporal Action Localization / Accuracy
Video / Accuracy
Zero-Shot Learning / Accuracy

Statistics

Papers: 14
Benchmarks: 18

Tasks

3D Action Recognition
3D Hand Pose Estimation
Action Detection
Action Localization
Action Recognition
Activity Recognition
Skeleton Based Action Recognition
Temporal Action Localization
Video
Zero-Shot Learning