Keypoint Transformer: Solving Joint Identification in Challenging Hands and Object Interactions for Accurate 3D Pose Estimation

Shreyas Hampali, Sayan Deb Sarkar, Mahdi Rad, Vincent Lepetit

2021-04-29 | CVPR 2022
Tasks: 3D Hand Pose Estimation, 3D Interacting Hand Pose Estimation, hand-object pose, Pose Estimation, 3D Pose Estimation

Abstract

We propose a robust and accurate method for estimating the 3D poses of two hands in close interaction from a single color image. This is a very challenging problem, as large occlusions and many confusions between the joints may happen. State-of-the-art methods solve this problem by regressing a heatmap for each joint, which requires solving two problems simultaneously: localizing the joints and recognizing them. In this work, we propose to separate these tasks by relying on a CNN to first localize joints as 2D keypoints, and on self-attention between the CNN features at these keypoints to associate them with the corresponding hand joint. The resulting architecture, which we call "Keypoint Transformer", is highly efficient as it achieves state-of-the-art performance with roughly half the number of model parameters on the InterHand2.6M dataset. We also show it can be easily extended to estimate the 3D pose of an object manipulated by one or two hands with high performance. Moreover, we created a new dataset of more than 75,000 images of two hands manipulating an object fully annotated in 3D and will make it publicly available.
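The separation described in the abstract (localize keypoints first, then let self-attention over the features at those keypoints decide which joint each one is) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the top-k keypoint selection, the single attention head, the random weights, and the names `top_k_keypoints`, `self_attention`, `identify_joints`, and `params` are all assumptions made for illustration. Non-maximum suppression, positional information, and training are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def top_k_keypoints(heatmap, k):
    """Return (k, 2) integer (row, col) locations of the k highest heatmap values.

    Simplification (assumption): a real detector would apply non-maximum
    suppression so that neighbouring pixels are not all selected.
    """
    flat = np.argsort(heatmap.ravel())[::-1][:k]
    rows, cols = np.unravel_index(flat, heatmap.shape)
    return np.stack([rows, cols], axis=1)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over (k, C) tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights

def identify_joints(features, heatmap, params, k=8):
    """Localize keypoints, then classify each one into a joint identity.

    features: (C, H, W) CNN feature map; heatmap: (H, W) keypoint scores.
    params: hypothetical dict of weight matrices (w_q, w_k, w_v, w_cls).
    """
    kps = top_k_keypoints(heatmap, k)
    # Gather the feature vector at every keypoint location -> (k, C).
    tokens = features[:, kps[:, 0], kps[:, 1]].T
    attended, weights = self_attention(
        tokens, params["w_q"], params["w_k"], params["w_v"]
    )
    # Per-keypoint distribution over joint identities.
    probs = softmax(attended @ params["w_cls"], axis=-1)
    return kps, probs, weights
```

With, say, 42 joint classes (two hands with 21 joints each, an assumed layout), `identify_joints` returns one probability row per detected keypoint. The point of the sketch is that localization (`top_k_keypoints`) and recognition (attention plus classifier) are separate steps, in contrast to regressing one heatmap per joint.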

Results

Task | Dataset | Metric | Value | Model
Hand | HO-3D v3 | AUC_J | 0.785 | KPT-Transformer
Hand | HO-3D v3 | PA-MPJPE | 10.9 | KPT-Transformer
Hand | HO-3D v2 | AUC_J | 0.786 | KPT-Transformer
Hand | HO-3D v2 | PA-MPJPE (mm) | 10.8 | KPT-Transformer
Hand | HO-3D v2 | ADD-S | 21.4 | Keypoint-Trans
Hand | HO-3D v2 | Average MPJPE (mm) | 25.5 | Keypoint-Trans
Hand | HO-3D v2 | OME | 68 | Keypoint-Trans
Hand | HO-3D v2 | PA-MPJPE | 10.8 | Keypoint-Trans
Hand | HO-3D v2 | ST-MPJPE | 25.7 | Keypoint-Trans
Pose Estimation | HO-3D v2 | ADD-S | 21.4 | Keypoint-Trans
Pose Estimation | HO-3D v2 | Average MPJPE (mm) | 25.5 | Keypoint-Trans
Pose Estimation | HO-3D v2 | OME | 68 | Keypoint-Trans
Pose Estimation | HO-3D v2 | PA-MPJPE | 10.8 | Keypoint-Trans
Pose Estimation | HO-3D v2 | ST-MPJPE | 25.7 | Keypoint-Trans
Pose Estimation | HO-3D v3 | AUC_J | 0.785 | KPT-Transformer
Pose Estimation | HO-3D v3 | PA-MPJPE | 10.9 | KPT-Transformer
Pose Estimation | HO-3D v2 | AUC_J | 0.786 | KPT-Transformer
Pose Estimation | HO-3D v2 | PA-MPJPE (mm) | 10.8 | KPT-Transformer
Hand Pose Estimation | HO-3D v3 | AUC_J | 0.785 | KPT-Transformer
Hand Pose Estimation | HO-3D v3 | PA-MPJPE | 10.9 | KPT-Transformer
Hand Pose Estimation | HO-3D v2 | AUC_J | 0.786 | KPT-Transformer
Hand Pose Estimation | HO-3D v2 | PA-MPJPE (mm) | 10.8 | KPT-Transformer
Hand Pose Estimation | HO-3D v2 | ADD-S | 21.4 | Keypoint-Trans
Hand Pose Estimation | HO-3D v2 | Average MPJPE (mm) | 25.5 | Keypoint-Trans
Hand Pose Estimation | HO-3D v2 | OME | 68 | Keypoint-Trans
Hand Pose Estimation | HO-3D v2 | PA-MPJPE | 10.8 | Keypoint-Trans
Hand Pose Estimation | HO-3D v2 | ST-MPJPE | 25.7 | Keypoint-Trans
3D | HO-3D v2 | ADD-S | 21.4 | Keypoint-Trans
3D | HO-3D v2 | Average MPJPE (mm) | 25.5 | Keypoint-Trans
3D | HO-3D v2 | OME | 68 | Keypoint-Trans
3D | HO-3D v2 | PA-MPJPE | 10.8 | Keypoint-Trans
3D | HO-3D v2 | ST-MPJPE | 25.7 | Keypoint-Trans
3D | HO-3D v3 | AUC_J | 0.785 | KPT-Transformer
3D | HO-3D v3 | PA-MPJPE | 10.9 | KPT-Transformer
3D | HO-3D v2 | AUC_J | 0.786 | KPT-Transformer
3D | HO-3D v2 | PA-MPJPE (mm) | 10.8 | KPT-Transformer
3D Hand Pose Estimation | HO-3D v3 | AUC_J | 0.785 | KPT-Transformer
3D Hand Pose Estimation | HO-3D v3 | PA-MPJPE | 10.9 | KPT-Transformer
3D Hand Pose Estimation | HO-3D v2 | AUC_J | 0.786 | KPT-Transformer
3D Hand Pose Estimation | HO-3D v2 | PA-MPJPE (mm) | 10.8 | KPT-Transformer
3D Hand Pose Estimation | HO-3D v2 | ADD-S | 21.4 | Keypoint-Trans
3D Hand Pose Estimation | HO-3D v2 | Average MPJPE (mm) | 25.5 | Keypoint-Trans
3D Hand Pose Estimation | HO-3D v2 | OME | 68 | Keypoint-Trans
3D Hand Pose Estimation | HO-3D v2 | PA-MPJPE | 10.8 | Keypoint-Trans
3D Hand Pose Estimation | HO-3D v2 | ST-MPJPE | 25.7 | Keypoint-Trans
6D Pose Estimation | HO-3D v2 | ADD-S | 21.4 | Keypoint-Trans
6D Pose Estimation | HO-3D v2 | Average MPJPE (mm) | 25.5 | Keypoint-Trans
6D Pose Estimation | HO-3D v2 | OME | 68 | Keypoint-Trans
6D Pose Estimation | HO-3D v2 | PA-MPJPE | 10.8 | Keypoint-Trans
6D Pose Estimation | HO-3D v2 | ST-MPJPE | 25.7 | Keypoint-Trans
1 Image, 2*2 Stitchi | HO-3D v2 | ADD-S | 21.4 | Keypoint-Trans
1 Image, 2*2 Stitchi | HO-3D v2 | Average MPJPE (mm) | 25.5 | Keypoint-Trans
1 Image, 2*2 Stitchi | HO-3D v2 | OME | 68 | Keypoint-Trans
1 Image, 2*2 Stitchi | HO-3D v2 | PA-MPJPE | 10.8 | Keypoint-Trans
1 Image, 2*2 Stitchi | HO-3D v2 | ST-MPJPE | 25.7 | Keypoint-Trans
1 Image, 2*2 Stitchi | HO-3D v3 | AUC_J | 0.785 | KPT-Transformer
1 Image, 2*2 Stitchi | HO-3D v3 | PA-MPJPE | 10.9 | KPT-Transformer
1 Image, 2*2 Stitchi | HO-3D v2 | AUC_J | 0.786 | KPT-Transformer
1 Image, 2*2 Stitchi | HO-3D v2 | PA-MPJPE (mm) | 10.8 | KPT-Transformer
3D Interacting Hand Pose Estimation | InterHand2.6M | MPJPE Test | 12.78 | Keypoint Transformer
3D Interacting Hand Pose Estimation | InterHand2.6M | MRRPE Test | 29.63 | Keypoint Transformer
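
Several of the metrics above are variants of mean per-joint position error. As a rough guide to how such numbers are usually obtained, the sketch below computes MPJPE as the mean Euclidean distance between predicted and ground-truth joints, and PA-MPJPE as the same error after a Procrustes (similarity) alignment of the prediction to the ground truth. It follows the generic definitions only; the exact evaluation protocols of HO-3D and InterHand2.6M (for example root-relative alignment or joint ordering) are not reproduced here, and the function names are illustrative.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Euclidean distance between (N, 3) joint arrays, in input units."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def procrustes_align(pred, gt):
    """Align pred to gt with the best-fit scale, rotation, and translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the 3x3 cross-covariance gives the optimal rotation.
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    flip = np.diag([1.0, 1.0, d])
    rot = u @ flip @ vt
    scale = (s * np.diag(flip)).sum() / (p ** 2).sum()
    return scale * p @ rot + mu_g

def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of the prediction."""
    return mpjpe(procrustes_align(pred, gt), gt)
```

Because the alignment removes global scale, rotation, and translation, PA-MPJPE reflects only the error in articulated shape, which is consistent with the PA-MPJPE values in the table being much lower than the unaligned MPJPE values for the same model.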
