Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

DirectPose: Direct End-to-End Multi-Person Pose Estimation

Zhi Tian, Hao Chen, Chunhua Shen

2019-11-18
Pose Estimation, Multi-Person Pose Estimation

Abstract

We propose the first direct end-to-end multi-person pose estimation framework, termed DirectPose. Inspired by recent anchor-free object detectors, which directly regress the two corners of target bounding-boxes, the proposed framework directly predicts instance-aware keypoints for all the instances from a raw input image, eliminating the need for heuristic grouping in bottom-up methods or bounding-box detection and RoI operations in top-down ones. We also propose a novel Keypoint Alignment (KPAlign) mechanism, which overcomes the main difficulty: lack of the alignment between the convolutional features and predictions in this end-to-end framework. KPAlign improves the framework's performance by a large margin while still keeping the framework end-to-end trainable. With the only postprocessing non-maximum suppression (NMS), our proposed framework can detect multi-person keypoints with or without bounding-boxes in a single shot. Experiments demonstrate that the end-to-end paradigm can achieve competitive or better performance than previous strong baselines, in both bottom-up and top-down methods. We hope that our end-to-end approach can provide a new perspective for the human pose estimation task.
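
The abstract names non-maximum suppression (NMS) as the only postprocessing step but does not say which variant is used. As a rough illustration only, the sketch below assumes greedy IoU-based NMS applied to the boxes that tightly enclose each predicted set of keypoints. The function names (`enclosing_box`, `iou`, `keypoint_nms`) and the 0.5 threshold are hypothetical and are not taken from the paper or its released code.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def enclosing_box(keypoints: Sequence[Point]) -> Box:
    """Smallest axis-aligned box containing all keypoints of one instance."""
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    return (min(xs), min(ys), max(xs), max(ys))


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def keypoint_nms(
    instances: Sequence[Sequence[Point]],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # assumed value, for illustration only
) -> List[int]:
    """Greedy NMS: keep the highest-scoring instances whose enclosing boxes
    do not overlap an already kept instance by more than iou_threshold."""
    boxes = [enclosing_box(kps) for kps in instances]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate predictions of one person, plus one distinct person.
poses = [
    [(0.0, 0.0), (10.0, 10.0)],
    [(1.0, 1.0), (11.0, 11.0)],
    [(50.0, 50.0), (60.0, 60.0)],
]
kept = keypoint_nms(poses, scores=[0.9, 0.8, 0.7])
```

In this toy input the second pose overlaps the first with an IoU of about 0.68, so it is suppressed and `kept` contains the indices of the first and third poses.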

Results

Task                   Dataset        Metric  Value  Model
Pose Estimation        COCO test-dev  AP      63.3   DirectPose (ResNet-101)
Pose Estimation        COCO test-dev  AP50    86.7   DirectPose (ResNet-101)
Pose Estimation        COCO test-dev  AP75    69.4   DirectPose (ResNet-101)
Pose Estimation        COCO test-dev  APL     71.2   DirectPose (ResNet-101)
Pose Estimation        COCO test-dev  APM     57.8   DirectPose (ResNet-101)
Pose Estimation        COCO test-dev  AP      64.8   DirectPose (ResNet-101)
Pose Estimation        COCO test-dev  AP50    87.8   DirectPose (ResNet-101)
Pose Estimation        COCO test-dev  AP75    71.1   DirectPose (ResNet-101)
Pose Estimation        COCO test-dev  APL     71.5   DirectPose (ResNet-101)
Pose Estimation        COCO test-dev  APM     60.4   DirectPose (ResNet-101)
3D                     COCO test-dev  AP      63.3   DirectPose (ResNet-101)
3D                     COCO test-dev  AP50    86.7   DirectPose (ResNet-101)
3D                     COCO test-dev  AP75    69.4   DirectPose (ResNet-101)
3D                     COCO test-dev  APL     71.2   DirectPose (ResNet-101)
3D                     COCO test-dev  APM     57.8   DirectPose (ResNet-101)
3D                     COCO test-dev  AP      64.8   DirectPose (ResNet-101)
3D                     COCO test-dev  AP50    87.8   DirectPose (ResNet-101)
3D                     COCO test-dev  AP75    71.1   DirectPose (ResNet-101)
3D                     COCO test-dev  APL     71.5   DirectPose (ResNet-101)
3D                     COCO test-dev  APM     60.4   DirectPose (ResNet-101)
1 Image, 2*2 Stitchi   COCO test-dev  AP      63.3   DirectPose (ResNet-101)
1 Image, 2*2 Stitchi   COCO test-dev  AP50    86.7   DirectPose (ResNet-101)
1 Image, 2*2 Stitchi   COCO test-dev  AP75    69.4   DirectPose (ResNet-101)
1 Image, 2*2 Stitchi   COCO test-dev  APL     71.2   DirectPose (ResNet-101)
1 Image, 2*2 Stitchi   COCO test-dev  APM     57.8   DirectPose (ResNet-101)
1 Image, 2*2 Stitchi   COCO test-dev  AP      64.8   DirectPose (ResNet-101)
1 Image, 2*2 Stitchi   COCO test-dev  AP50    87.8   DirectPose (ResNet-101)
1 Image, 2*2 Stitchi   COCO test-dev  AP75    71.1   DirectPose (ResNet-101)
1 Image, 2*2 Stitchi   COCO test-dev  APL     71.5   DirectPose (ResNet-101)
1 Image, 2*2 Stitchi   COCO test-dev  APM     60.4   DirectPose (ResNet-101)
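
For readers unfamiliar with the metrics listed above: COCO keypoint AP is, to my understanding, computed by matching predictions to ground truth using Object Keypoint Similarity (OKS) in place of box IoU, with AP50 and AP75 using OKS thresholds of 0.5 and 0.75. The sketch below shows one way to compute OKS for a single prediction. The function name `oks` and the per-keypoint `sigmas` in the example are illustrative assumptions, not the official COCO constants, and this page does not describe the evaluation code actually used.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def oks(
    pred: Sequence[Point],
    gt: Sequence[Point],
    visible: Sequence[bool],
    area: float,
    sigmas: Sequence[float],
) -> float:
    """Object Keypoint Similarity between one prediction and one ground truth.

    Each labeled keypoint contributes exp(-d^2 / (2 * area * k^2)) with
    k = 2 * sigma, where d is the prediction error in pixels and area is the
    ground-truth object area. The result is the mean over labeled keypoints.
    """
    total = 0.0
    count = 0
    for (px, py), (gx, gy), is_visible, sigma in zip(pred, gt, visible, sigmas):
        if not is_visible:
            continue  # unlabeled keypoints do not affect the score
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k = 2.0 * sigma
        total += math.exp(-d2 / (2.0 * area * k * k))
        count += 1
    return total / count if count else 0.0


# Illustrative values only: three keypoints, one of them unlabeled.
gt_pose = [(10.0, 10.0), (20.0, 10.0), (15.0, 30.0)]
pred_pose = [(11.0, 10.0), (20.0, 12.0), (40.0, 40.0)]
labeled = [True, True, False]
example_sigmas = [0.05, 0.05, 0.1]  # hypothetical, not the COCO constants

perfect = oks(gt_pose, gt_pose, labeled, area=400.0, sigmas=example_sigmas)
shifted = oks(pred_pose, gt_pose, labeled, area=400.0, sigmas=example_sigmas)
counts_for_ap50 = shifted >= 0.5  # a match at the AP50 threshold
```

A perfect prediction scores 1.0, and the score decays toward 0 as keypoints drift, more slowly for larger objects and for keypoints with larger sigma.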

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)