Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Pixel-wise Regression: 3D Hand Pose Estimation via Spatial-form Representation and Differentiable Decoder

Xingyuan Zhang, Fuhai Zhang

2019-05-06
Tags: 3D Hand Pose Estimation, regression, Form, Pose Estimation, Hand Pose Estimation

Abstract

3D hand pose estimation from a single depth image is an essential topic in computer vision and human-computer interaction. Although the rise of deep learning methods has greatly improved accuracy, the problem remains hard because of the complex structure of the human hand. Existing deep learning methods either lose spatial information about the hand structure or lack direct supervision of the joint coordinates. In this paper, we propose a novel Pixel-wise Regression method, which uses a spatial-form representation (SFR) and a differentiable decoder (DD) to solve both problems. To apply our method, we build a model in which we design a particular SFR and its corresponding DD. These divide the 3D joint coordinates into two parts, plane coordinates and depth coordinates, which are handled by two modules named Plane Regression (PR) and Depth Regression (DR), respectively. We conduct an ablation experiment to show that the proposed method achieves better results than previous methods. We also explore how different training strategies influence the learned SFRs and the results. Experiments on three public datasets demonstrate that our model is comparable with existing state-of-the-art models, and on one of them it reduces the mean 3D joint error by 25%.
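The abstract does not spell out how the SFR is decoded, so the following is only a minimal sketch of one common differentiable decoding scheme (a soft-argmax, or expected-value, decoder), not the authors' PR and DR modules. It assumes the spatial-form representation for a single joint is a 2D grid of scores plus a per-pixel depth map. The names `softmax2d`, `decode_joint`, `logits`, and `depth_map` are hypothetical.

```python
import math


def softmax2d(logits):
    """Normalize a 2D grid of scores into a heatmap that sums to 1."""
    peak = max(max(row) for row in logits)  # subtract the max for numerical stability
    exps = [[math.exp(v - peak) for v in row] for row in logits]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]


def decode_joint(logits, depth_map):
    """Decode one joint as (u, v, d) from a heatmap-style representation.

    Assumption: plane coordinates are the expected column and row under the
    heatmap, and depth is the heatmap-weighted average of the depth map.
    Every step is a smooth function of `logits`, so a loss on (u, v, d)
    could supervise the joint coordinates directly.
    """
    heat = softmax2d(logits)
    u = sum(p * x for row in heat for x, p in enumerate(row))  # expected column
    v = sum(p * y for y, row in enumerate(heat) for p in row)  # expected row
    d = sum(
        p * z
        for heat_row, depth_row in zip(heat, depth_map)
        for p, z in zip(heat_row, depth_row)
    )
    return u, v, d


# A sharply peaked heatmap at row 1, column 2 of a 3x3 grid.
logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 20.0], [0.0, 0.0, 0.0]]
depth_map = [[100.0, 100.0, 100.0], [100.0, 100.0, 250.0], [100.0, 100.0, 100.0]]
u, v, d = decode_joint(logits, depth_map)
print(round(u, 3), round(v, 3), round(d, 3))
```

Because the output is an expectation instead of a hard argmax, the heatmap keeps the spatial layout of the hand while the decoded coordinates can still be compared directly with ground-truth joints.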

Results

Task                  Dataset     Metric            Value   Model
Hand                  MSRA Hands  Average 3D Error  7.985   PixelwiseRegression
Hand                  ICVL Hands  Average 3D Error  6.152   PixelwiseRegression
Hand                  NYU Hands   Average 3D Error  9.173   PixelwiseRegression
Hand                  HANDS 2017  Average 3D Error  10.57   PixelwiseRegression
Pose Estimation       MSRA Hands  Average 3D Error  7.985   PixelwiseRegression
Pose Estimation       ICVL Hands  Average 3D Error  6.152   PixelwiseRegression
Pose Estimation       NYU Hands   Average 3D Error  9.173   PixelwiseRegression
Pose Estimation       HANDS 2017  Average 3D Error  10.57   PixelwiseRegression
Hand Pose Estimation  MSRA Hands  Average 3D Error  7.985   PixelwiseRegression
Hand Pose Estimation  ICVL Hands  Average 3D Error  6.152   PixelwiseRegression
Hand Pose Estimation  NYU Hands   Average 3D Error  9.173   PixelwiseRegression
Hand Pose Estimation  HANDS 2017  Average 3D Error  10.57   PixelwiseRegression
3D                    MSRA Hands  Average 3D Error  7.985   PixelwiseRegression
3D                    ICVL Hands  Average 3D Error  6.152   PixelwiseRegression
3D                    NYU Hands   Average 3D Error  9.173   PixelwiseRegression
3D                    HANDS 2017  Average 3D Error  10.57   PixelwiseRegression
1 Image, 2*2 Stitchi  MSRA Hands  Average 3D Error  7.985   PixelwiseRegression
1 Image, 2*2 Stitchi  ICVL Hands  Average 3D Error  6.152   PixelwiseRegression
1 Image, 2*2 Stitchi  NYU Hands   Average 3D Error  9.173   PixelwiseRegression
1 Image, 2*2 Stitchi  HANDS 2017  Average 3D Error  10.57   PixelwiseRegression
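
The page does not define the "Average 3D Error" metric. In hand pose benchmarks it is conventionally the mean Euclidean distance between predicted and ground-truth joint positions, and the sketch below assumes that definition. The function name `mean_joint_error` and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math


def mean_joint_error(pred, truth):
    """Average Euclidean distance between matching predicted and true 3D joints.

    Assumption: `pred` and `truth` are equal-length sequences of (x, y, z)
    points in the same units and the same joint order.
    """
    if not pred or len(pred) != len(truth):
        raise ValueError("pred and truth must be non-empty and the same length")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)


# Two joints: the first is off by a 3-4-5 triangle, the second is exact.
pred = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
truth = [(3.0, 4.0, 0.0), (1.0, 1.0, 1.0)]
print(mean_joint_error(pred, truth))  # prints 2.5
```

For a whole test set, this per-frame value would typically be averaged again over all frames.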
