Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


UniHPE: Towards Unified Human Pose Estimation via Contrastive Learning

Zhongyu Jiang, Wenhao Chai, Lei LI, Zhuoran Zhou, Cheng-Yen Yang, Jenq-Neng Hwang

2023-11-24

Tasks: 3D Human Pose Estimation · 2D Human Pose Estimation · Pose Estimation · Object Tracking · Contrastive Learning · Action Recognition

Paper · PDF

Abstract

In recent times, there has been a growing interest in developing effective perception techniques for combining information from multiple modalities. This involves aligning features obtained from diverse sources to enable more efficient training with larger datasets and constraints, as well as leveraging the wealth of information contained in each modality. 2D and 3D Human Pose Estimation (HPE) are two critical perceptual tasks in computer vision with numerous downstream applications, such as Action Recognition, Human-Computer Interaction, and Object Tracking. Yet, few works have explicitly studied the correlation between images and 2D/3D human poses through a contrastive paradigm. In this paper, we propose UniHPE, a unified Human Pose Estimation pipeline that aligns features from all three modalities, i.e., 2D human pose estimation, lifting-based and image-based 3D human pose estimation, in the same pipeline. To align more than two modalities at the same time, we propose a novel singular value based contrastive learning loss, which better aligns different modalities and further boosts performance. In our evaluation, UniHPE achieves remarkable performance metrics: MPJPE $50.5$mm on the Human3.6M dataset and PA-MPJPE $51.6$mm on the 3DPW dataset. Our proposed method holds immense potential to advance the field of computer vision and contribute to various applications.
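The abstract names a singular value based contrastive loss for aligning more than two modalities but does not give its formula here. As an illustration only, not the paper's actual loss, one plausible sketch stacks each sample's normalized embeddings from the three modalities into a matrix and penalizes the singular values beyond the first: that residual energy is zero exactly when all modalities produce the same direction, so minimizing it pulls the modalities into alignment. The function name `sv_alignment_loss` and all shapes are assumptions.

```python
import numpy as np

def sv_alignment_loss(embs):
    """Hypothetical singular-value-based alignment loss (illustrative only).

    embs: list of (batch, dim) arrays, one per modality; row i of each
    array is the embedding of the same sample in that modality.
    """
    batch = embs[0].shape[0]
    loss = 0.0
    for i in range(batch):
        # Stack the i-th sample's unit-normalized embeddings: (n_modalities, dim).
        m = np.stack([e[i] / np.linalg.norm(e[i]) for e in embs])
        # Singular values in descending order; perfectly aligned unit vectors
        # make m rank-1, so every singular value after the first is zero.
        s = np.linalg.svd(m, compute_uv=False)
        # Fraction of spectral energy outside the top singular value.
        loss += np.sum(s[1:] ** 2) / np.sum(s ** 2)
    return loss / batch
```

With identical embeddings across modalities the loss is exactly zero, while independent random embeddings give a strictly positive value, which is the qualitative behavior a multi-modality alignment objective needs.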

Results

Task                      Dataset  Metric    Value  Model
3D Human Pose Estimation  3DPW     PA-MPJPE  51.6   UniHPE (GT)
3D Human Pose Estimation  3DPW     PA-MPJPE  65.7   UniHPE-w48
Pose Estimation           3DPW     PA-MPJPE  51.6   UniHPE (GT)
Pose Estimation           3DPW     PA-MPJPE  65.7   UniHPE-w48

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
MVA 2025 Small Multi-Object Tracking for Spotting Birds Challenge: Dataset, Methods, and Results (2025-07-17)
SemCSE: Semantic Contrastive Sentence Embeddings Using LLM-Generated Summaries For Scientific Abstracts (2025-07-17)
HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals (2025-07-17)