Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation

William McNally, Kanav Vats, Alexander Wong, John McPhee

2021-11-16 | Pose Estimation, Keypoint Estimation

Abstract

In keypoint estimation tasks such as human pose estimation, heatmap-based regression is the dominant approach despite possessing notable drawbacks: heatmaps intrinsically suffer from quantization error and require excessive computation to generate and post-process. Motivated to find a more efficient solution, we propose to model individual keypoints and sets of spatially related keypoints (i.e., poses) as objects within a dense single-stage anchor-based detection framework. Hence, we call our method KAPAO (pronounced "Ka-Pow"), for Keypoints And Poses As Objects. KAPAO is applied to the problem of single-stage multi-person human pose estimation by simultaneously detecting human pose and keypoint objects and fusing the detections to exploit the strengths of both object representations. In experiments, we observe that KAPAO is faster and more accurate than previous methods, which suffer greatly from heatmap post-processing. The accuracy-speed trade-off is especially favourable in the practical setting when not using test-time augmentation. Source code: https://github.com/wmcnally/kapao.
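
The abstract describes fusing pose objects with individually detected keypoint objects. The following is a minimal sketch of one way such a fusion step could look. It is not the authors' implementation (see the linked repository for that). The function name `fuse_detections`, the data layout, the nearest-match rule, and the `max_dist` threshold are all assumptions made for illustration.

```python
import math

def fuse_detections(poses, keypoint_objects, max_dist=10.0):
    """Refine pose objects with nearby keypoint objects (illustrative only).

    poses: list of poses, each a list of (x, y, conf) tuples indexed by
        keypoint class.
    keypoint_objects: list of (cls, x, y, conf) detections.
    max_dist: hypothetical pixel radius for matching a keypoint object
        to a pose keypoint of the same class.
    """
    # Copy so the input poses are left untouched.
    fused = [list(pose) for pose in poses]
    for cls, x, y, conf in keypoint_objects:
        nearest, nearest_dist = None, max_dist
        # Find the pose whose keypoint of this class is closest.
        for i, pose in enumerate(poses):
            px, py, _ = pose[cls]
            dist = math.hypot(px - x, py - y)
            if dist <= nearest_dist:
                nearest, nearest_dist = i, dist
        # Replace the pose keypoint only if the keypoint object is
        # within range and more confident than the current estimate.
        if nearest is not None and conf > fused[nearest][cls][2]:
            fused[nearest][cls] = (x, y, conf)
    return fused

poses = [[(10.0, 10.0, 0.3), (20.0, 20.0, 0.9)]]
keypoint_objects = [
    (0, 11.0, 10.0, 0.8),     # close and more confident: replaces keypoint 0
    (1, 21.0, 20.0, 0.5),     # close but less confident: ignored
    (0, 100.0, 100.0, 0.99),  # too far from any pose: ignored
]
fused = fuse_detections(poses, keypoint_objects)
```

In this sketch the pose object supplies the grouping of keypoints into a person, while a matched keypoint object can supply a more confident location for a single joint, which is the complementary behaviour the abstract alludes to.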

Results

Task: Pose Estimation

Dataset: COCO test-dev

| Model | AP | AP50 | AP75 | APM | APL | AR |
|---|---|---|---|---|---|---|
| KAPAO-L | 70.3 | 91.2 | 77.8 | 66.3 | 76.8 | 77.7 |
| KAPAO-M | 68.8 | 90.5 | 76.5 | 64.3 | 76 | 76.3 |
| KAPAO-S | 63.8 | 88.4 | 70.4 | 58.6 | 71.7 | 71.2 |

Dataset: CrowdPose

| Model | AP | AP50 | AP75 | APM | Test |
|---|---|---|---|---|---|
| KAPAO-L | 68.9 | 89.4 | 75.6 | 69.9 | 76.6 |
| KAPAO-M | 67.1 | 88.8 | 73.4 | 68.1 | 75.2 |
| KAPAO-S | 63.8 | 87.7 | 69.4 | 64.8 | 72.1 |
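
The page does not define its metric columns. Assuming they follow the standard COCO keypoint evaluation convention, AP is computed from Object Keypoint Similarity (OKS), with AP50 and AP75 using OKS thresholds of 0.50 and 0.75. The sketch below computes OKS for a single person. The function name and the sigma values in the usage example are illustrative assumptions, not values taken from this page.

```python
import math

def object_keypoint_similarity(pred, truth, visible, area, sigmas):
    """OKS between predicted and ground-truth keypoints for one person.

    pred, truth: lists of (x, y) coordinates, one per keypoint.
    visible: ground-truth visibility flags; keypoints with flag <= 0
        are unlabeled and skipped.
    area: ground-truth object area in pixels (the squared scale term).
    sigmas: per-keypoint falloff constants (illustrative values here).
    """
    total, count = 0.0, 0
    for (px, py), (tx, ty), v, sigma in zip(pred, truth, visible, sigmas):
        if v <= 0:
            continue
        d2 = (px - tx) ** 2 + (py - ty) ** 2
        k = 2.0 * sigma  # COCO convention: k_i = 2 * sigma_i
        total += math.exp(-d2 / (2.0 * area * k ** 2))
        count += 1
    return total / count if count else 0.0

truth = [(50.0, 50.0), (60.0, 80.0), (0.0, 0.0)]
visible = [2, 1, 0]            # third keypoint is unlabeled
sigmas = [0.05, 0.05, 0.05]    # hypothetical constants
perfect = object_keypoint_similarity(truth, truth, visible, 2500.0, sigmas)
shifted = object_keypoint_similarity(
    [(53.0, 54.0), (60.0, 80.0), (9.0, 9.0)], truth, visible, 2500.0, sigmas
)
```

A prediction counts as a true positive at a given threshold when its OKS with a ground-truth person meets that threshold, so a higher AP75 than a competing method indicates tighter localization rather than just more detections.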

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)