Towards Good Practices for Deep 3D Hand Pose Estimation

Hengkai Guo, Guijin Wang, Xinghao Chen, Cairong Zhang

2017-07-23 | 3D Hand Pose Estimation | Data Augmentation | Fingertip Detection | Pose Estimation | Hand Pose Estimation

Abstract

3D hand pose estimation from a single depth image is an important and challenging problem for human-computer interaction. Recently, deep convolutional networks (ConvNets) with sophisticated designs have been employed to address it, but their improvement over traditional random-forest-based methods is not so apparent. To exploit good practices and improve the performance of hand pose estimation, we propose a tree-structured Region Ensemble Network (REN) for direct 3D coordinate regression. It first partitions the last convolution outputs of the ConvNet into several grid regions. The results from separate fully-connected (FC) regressors on each region are then integrated by another FC layer to perform the estimation. By exploiting several training strategies, including data augmentation and the smooth $L_1$ loss, the proposed REN significantly improves the ability of the ConvNet to localize hand joints. The experimental results demonstrate that our approach achieves the best performance among state-of-the-art algorithms on three public hand pose datasets. We also evaluate our method on fingertip detection and human pose datasets and obtain state-of-the-art accuracy.
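The region-ensemble idea and the smooth $L_1$ loss mentioned in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `split_into_regions`, `RegionEnsemble`, and `smooth_l1` are hypothetical, the regions are taken to be a non-overlapping 2x2 grid, each regressor is a single randomly initialised FC layer with ReLU, and the loss uses the common threshold of 1. The abstract does not specify any of these details.

```python
import numpy as np


def split_into_regions(features, grid=2):
    """Split a (C, H, W) feature map into grid*grid non-overlapping regions (assumed layout)."""
    _, height, width = features.shape
    row_groups = np.array_split(np.arange(height), grid)
    col_groups = np.array_split(np.arange(width), grid)
    return [
        features[:, rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
        for rows in row_groups
        for cols in col_groups
    ]


class RegionEnsemble:
    """One FC regressor per region, fused by a final FC layer (illustrative sizes)."""

    def __init__(self, channels, height, width, grid=2, hidden=32, num_joints=16, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = grid
        # Use a dummy map to find the flattened size of each region.
        regions = split_into_regions(np.zeros((channels, height, width)), grid)
        self.region_weights = [rng.normal(0.0, 0.01, size=(r.size, hidden)) for r in regions]
        self.region_biases = [np.zeros(hidden) for _ in regions]
        self.fusion_weight = rng.normal(0.0, 0.01, size=(hidden * len(regions), num_joints * 3))
        self.fusion_bias = np.zeros(num_joints * 3)

    def forward(self, features):
        regions = split_into_regions(features, self.grid)
        # Each region goes through its own FC layer followed by ReLU.
        hidden = [
            np.maximum(r.reshape(-1) @ w + b, 0.0)
            for r, w, b in zip(regions, self.region_weights, self.region_biases)
        ]
        # Concatenate the per-region outputs and regress all joint coordinates at once.
        fused = np.concatenate(hidden)
        return (fused @ self.fusion_weight + self.fusion_bias).reshape(-1, 3)


def smooth_l1(pred, target):
    """Smooth L1 loss: quadratic for |error| < 1, linear otherwise, averaged over elements."""
    diff = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    return float(np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5).mean())


if __name__ == "__main__":
    feature_map = np.random.default_rng(1).normal(size=(8, 12, 12))
    model = RegionEnsemble(channels=8, height=12, width=12)
    joints = model.forward(feature_map)
    print(joints.shape)  # one (x, y, z) row per joint
    print(smooth_l1(joints, np.zeros_like(joints)))
```

The linear tail of the smooth $L_1$ loss makes training less sensitive to outlier joint annotations than a plain squared error, which is one plausible reason it appears among the training strategies listed in the abstract.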

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Hand | ICVL Hands | Average 3D Error | 7.31 | Tree Region Ensemble Network |
| Hand | NYU Hands | Average 3D Error | 15.6 | REN |
| Pose Estimation | ITOP top-view | Mean mAP | 75.5 | REN |
| Pose Estimation | ITOP front-view | Mean mAP | 84.9 | REN |
| Pose Estimation | ICVL Hands | Average 3D Error | 7.31 | Tree Region Ensemble Network |
| Pose Estimation | NYU Hands | Average 3D Error | 15.6 | REN |
| Hand Pose Estimation | ICVL Hands | Average 3D Error | 7.31 | Tree Region Ensemble Network |
| Hand Pose Estimation | NYU Hands | Average 3D Error | 15.6 | REN |
| 3D | ITOP top-view | Mean mAP | 75.5 | REN |
| 3D | ITOP front-view | Mean mAP | 84.9 | REN |
| 3D | ICVL Hands | Average 3D Error | 7.31 | Tree Region Ensemble Network |
| 3D | NYU Hands | Average 3D Error | 15.6 | REN |
| 1 Image, 2*2 Stitchi | ITOP top-view | Mean mAP | 75.5 | REN |
| 1 Image, 2*2 Stitchi | ITOP front-view | Mean mAP | 84.9 | REN |
| 1 Image, 2*2 Stitchi | ICVL Hands | Average 3D Error | 7.31 | Tree Region Ensemble Network |
| 1 Image, 2*2 Stitchi | NYU Hands | Average 3D Error | 15.6 | REN |
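For readers unfamiliar with the "Average 3D Error" metric, the sketch below shows one common way such a number is computed: the Euclidean distance between each predicted and ground-truth joint, averaged over all joints and frames. The function name `average_3d_error` and the `(frames, joints, 3)` array layout are assumptions for illustration. The page does not state the exact evaluation protocol or units behind the values above, so the result is simply in whatever units the input coordinates use.

```python
import numpy as np


def average_3d_error(pred, gt):
    """Mean Euclidean distance over all frames and joints.

    Both arrays are assumed to have shape (frames, joints, 3).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Per-joint distance along the coordinate axis, then a global mean.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


if __name__ == "__main__":
    gt = np.zeros((2, 3, 3))               # 2 frames, 3 joints
    pred = gt + np.array([3.0, 4.0, 0.0])  # every joint is off by a 3-4-5 offset
    print(average_3d_error(pred, gt))      # 5.0
```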