Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


TransPose: Keypoint Localization via Transformer

Sen Yang, Zhibin Quan, Mu Nie, Wankou Yang

2020-12-28 · ICCV 2021 · Pose Estimation · Multi-Person Pose Estimation · Keypoint Detection
Paper · PDF · Code (official)

Abstract

While CNN-based models have made remarkable progress on human pose estimation, what spatial dependencies they capture to localize keypoints remains unclear. In this work, we propose a model called TransPose, which introduces a Transformer for human pose estimation. The attention layers built into the Transformer enable our model to capture long-range relationships efficiently and also reveal what dependencies the predicted keypoints rely on. To predict keypoint heatmaps, the last attention layer acts as an aggregator, which collects contributions from image clues and forms the maximum positions of keypoints. Such a heatmap-based localization approach via Transformer conforms to the principle of Activation Maximization [Erhan et al., 2009]. The revealed dependencies are image-specific and fine-grained, and can also provide evidence of how the model handles special cases, e.g., occlusion. Experiments show that TransPose achieves 75.8 AP and 75.0 AP on the COCO validation and test-dev sets, while being more lightweight and faster than mainstream CNN architectures. The TransPose model also transfers very well to the MPII benchmark, achieving superior performance on the test set when fine-tuned with small training costs. Code and pre-trained models are publicly available at https://github.com/yangsenius/TransPose.
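The heatmap-based localization the abstract describes ultimately reads a keypoint position off the maximum activation in each predicted heatmap channel. A minimal NumPy sketch of that decoding step (the function name and toy data are illustrative, not taken from the repository):

```python
import numpy as np

def decode_heatmaps(heatmaps):
    """Decode keypoint coordinates from predicted heatmaps.

    heatmaps: array of shape (K, H, W), one channel per keypoint.
    Returns a (K, 2) array of (x, y) positions and a (K,) confidence vector.
    """
    K, H, W = heatmaps.shape
    flat = heatmaps.reshape(K, -1)
    idx = flat.argmax(axis=1)      # index of the maximum activation per channel
    conf = flat.max(axis=1)        # peak value, used as a confidence score
    xs, ys = idx % W, idx // W     # unravel flat index back to (x, y)
    return np.stack([xs, ys], axis=1), conf

# Toy example: a single 4x4 heatmap peaking at (x=2, y=1)
hm = np.zeros((1, 4, 4))
hm[0, 1, 2] = 0.9
coords, conf = decode_heatmaps(hm)
# coords[0] → [2, 1], conf[0] → 0.9
```

In practice, heatmap decoders often add sub-pixel refinement (e.g. a quarter-pixel shift toward the second-highest neighbor), which this sketch omits for clarity.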

Results

Task | Dataset | Metric | Value | Model
Pose Estimation | OCHuman | Validation AP | 62.3 | TransPose-H
Pose Estimation | COCO test-dev | AP | 75.0 | TransPose-H-A6
Pose Estimation | COCO test-dev | AP50 | 92.2 | TransPose-H-A6
Pose Estimation | COCO test-dev | AP75 | 82.3 | TransPose-H-A6
Pose Estimation | COCO test-dev | APL | 81.1 | TransPose-H-A6
Pose Estimation | COCO test-dev | APM | 71.3 | TransPose-H-A6
Pose Estimation | MPII Human Pose | PCKh-0.5 | 93.5 | TransPose
Pose Estimation | COCO (Common Objects in Context) | Test AP | 75.0 | TransPose (256x192)
Pose Estimation | COCO (Common Objects in Context) | Validation AP | 75.8 | TransPose (256x192)
Pose Estimation | CrowdPose | AP Easy | 79.5 | TransPose-H
Pose Estimation | CrowdPose | AP Hard | 62.2 | TransPose-H
Pose Estimation | CrowdPose | AP Medium | 72.9 | TransPose-H
Pose Estimation | CrowdPose | mAP @0.5:0.95 | 71.8 | TransPose-H
Pose Estimation | OCHuman | AP50 | 82.7 | TransPose-H
Pose Estimation | OCHuman | AP75 | 67.1 | TransPose-H
Multi-Person Pose Estimation | CrowdPose | AP Easy | 79.5 | TransPose-H
Multi-Person Pose Estimation | CrowdPose | AP Hard | 62.2 | TransPose-H
Multi-Person Pose Estimation | CrowdPose | AP Medium | 72.9 | TransPose-H
Multi-Person Pose Estimation | CrowdPose | mAP @0.5:0.95 | 71.8 | TransPose-H
Multi-Person Pose Estimation | OCHuman | AP50 | 82.7 | TransPose-H
Multi-Person Pose Estimation | OCHuman | AP75 | 67.1 | TransPose-H
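The COCO AP numbers in the table are computed from Object Keypoint Similarity (OKS), which plays the role IoU does in object detection: per-keypoint distances are normalized by object scale and a per-keypoint falloff constant, then averaged over visible keypoints. A minimal sketch of the OKS formula (the per-keypoint sigmas are the standard constants from the COCO keypoints evaluation; this is illustrative, not the official evaluator):

```python
import numpy as np

# Per-keypoint falloff constants from the COCO keypoints evaluation,
# in COCO's 17-keypoint order (nose, eyes, ears, shoulders, ...).
COCO_SIGMAS = np.array([
    .26, .25, .25, .35, .35, .79, .79, .72, .72,
    .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0

def oks(pred, gt, area, visible):
    """pred, gt: (17, 2) keypoint arrays; area: object scale; visible: (17,) bool."""
    d2 = ((pred - gt) ** 2).sum(axis=1)            # squared pixel distances
    var = (2 * COCO_SIGMAS) ** 2
    e = d2 / (var * (area + np.spacing(1)) * 2)    # scale-normalized error
    return np.exp(-e)[visible].mean() if visible.any() else 0.0

# A perfect prediction scores OKS = 1.0
gt = np.random.rand(17, 2) * 100
print(oks(gt, gt, area=50 * 50, visible=np.ones(17, dtype=bool)))  # → 1.0
```

AP50 and AP75 in the table are then the average precision at OKS thresholds 0.5 and 0.75, while the headline AP averages over thresholds 0.50:0.05:0.95.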

Related Papers

- $π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
- Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
- DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
- From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
- AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
- SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
- SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
- Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)