
Human Pose as Compositional Tokens

Zigang Geng, Chunyu Wang, Yixuan Wei, Ze Liu, Houqiang Li, Han Hu

2023-03-21 | CVPR 2023 | Pose Estimation

Abstract

Human pose is typically represented by a coordinate vector of body joints or their heatmap embeddings. While these representations are easy to process, they admit unrealistic pose estimates because the dependencies between body joints are not modeled. In this paper, we present a structured representation, named Pose as Compositional Tokens (PCT), to explore the joint dependency. It represents a pose by M discrete tokens, each characterizing a sub-structure of several interdependent joints. The compositional design enables it to achieve a small reconstruction error at a low cost. We then cast pose estimation as a classification task. In particular, we learn a classifier to predict the categories of the M tokens from an image. A pre-learned decoder network recovers the pose from the tokens without further post-processing. We show that it achieves pose estimation results better than or comparable to those of existing methods in general scenarios, and continues to work well under occlusion, which is ubiquitous in practice. The code and models are publicly available at https://github.com/Gengzigang/PCT.
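The inference path described in the abstract (classify M tokens, look each one up in a codebook, decode to joints) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `decode_pose_from_logits`, the sizes `M`, `V`, `D`, `K`, and the single linear layer standing in for the pre-learned decoder are all assumptions made for illustration; the official code is in the repository linked above.

```python
import numpy as np

def decode_pose_from_logits(logits, codebook, dec_weight, dec_bias):
    """Turn per-token class logits into joint coordinates.

    logits:     (M, V) classifier scores, one row per token (hypothetical shapes)
    codebook:   (V, D) one feature vector per token category
    dec_weight: (M * D, K * 2) stand-in for the pre-learned decoder (assumed linear)
    dec_bias:   (K * 2,)
    """
    token_ids = logits.argmax(axis=1)        # (M,) predicted category of each token
    token_feats = codebook[token_ids]        # (M, D) codebook lookup
    flat = token_feats.reshape(-1)           # (M * D,)
    joints = flat @ dec_weight + dec_bias    # (K * 2,) decoded directly, no post-processing
    return token_ids, joints.reshape(-1, 2)  # (K, 2) x/y per joint

# Illustrative sizes only; they are not taken from the paper.
M, V, D, K = 4, 8, 3, 17
rng = np.random.default_rng(0)
logits = rng.normal(size=(M, V))
codebook = rng.normal(size=(V, D))
dec_weight = rng.normal(size=(M * D, K * 2))
dec_bias = np.zeros(K * 2)

token_ids, pose = decode_pose_from_logits(logits, codebook, dec_weight, dec_bias)
print(token_ids.shape, pose.shape)
```

Because the image-side network only has to choose among V categories for each of the M tokens, every prediction decodes to a pose assembled from learned sub-structures, which is the property the abstract credits for robustness under occlusion.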

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Pose Estimation | COCO test-dev | AP | 78.3 | PCT (256x256) |
| Pose Estimation | COCO test-dev | AP50 | 9 | PCT (256x256) |
| Pose Estimation | COCO test-dev | AP75 | 85.9 | PCT (256x256) |
| Pose Estimation | MPII Human Pose | PCKh-0.5 | 94.3 | PCT (swin-l, test set) |
| Pose Estimation | MPII Human Pose | PCKh-0.5 | 93.8 | PCT (swin-b, test set) |
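For readers unfamiliar with the MPII metric reported above, PCKh-0.5 is the percentage of joints whose predicted location lies within 0.5 of the head segment length from the ground truth. The sketch below shows that calculation under simplifying assumptions: the function name `pckh`, the array layout, and the optional visibility mask are illustrative, and the official MPII evaluation script may differ in details such as how head size is derived and which joints are counted.

```python
import numpy as np

def pckh(pred, gt, head_sizes, visible=None, thr=0.5):
    """Percentage of Correct Keypoints, normalized by head size.

    pred, gt:   (N, K, 2) predicted and ground-truth joint coordinates
    head_sizes: (N,) head segment length per person
    visible:    optional (N, K) boolean mask of joints to score
    """
    dist = np.linalg.norm(pred - gt, axis=-1)   # (N, K) Euclidean error per joint
    norm_dist = dist / head_sizes[:, None]      # error relative to head size
    if visible is None:
        visible = np.ones(dist.shape, dtype=bool)
    correct = (norm_dist <= thr) & visible
    return 100.0 * correct.sum() / visible.sum()

# One person, four joints, head size 10: errors are 0, 3, 4 and 10 pixels.
gt = np.zeros((1, 4, 2))
pred = np.array([[[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [6.0, 8.0]]])
score = pckh(pred, gt, head_sizes=np.array([10.0]))
print(score)  # 75.0: three of the four joints fall within 0.5 * head size
```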

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)