Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

Ziniu Wan, Zhengjia Li, Maoqing Tian, Jianbo Liu, Shuai Yi, Hongsheng Li

2021-09-06 | ICCV 2021 | Tasks: 3D Human Pose Estimation, 3D Absolute Human Pose Estimation, Pose Estimation

Abstract

3D human shape and pose estimation is an essential task for human motion analysis and is widely used in many 3D applications. However, existing methods cannot simultaneously capture relations at multiple levels, including the spatial-temporal level and the human joint level. Therefore, they fail to make accurate predictions in hard scenarios with cluttered backgrounds, occlusion, or extreme poses. To this end, we propose the Multi-level Attention Encoder-Decoder Network (MAED), which includes a Spatial-Temporal Encoder (STE) and a Kinematic Topology Decoder (KTD) to model multi-level attention in a unified framework. STE consists of a series of cascaded blocks based on Multi-Head Self-Attention, and each block uses two parallel branches to learn spatial and temporal attention, respectively. Meanwhile, KTD models joint-level attention. It regards pose estimation as a top-down hierarchical process similar to the SMPL kinematic tree. When the 3DPW training set is used, MAED outperforms previous state-of-the-art methods by 6.2, 7.2, and 2.4 mm of PA-MPJPE on the three widely used benchmarks 3DPW, MPI-INF-3DHP, and Human3.6M, respectively. Our code is available at https://github.com/ziniuwan/maed.
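To make the idea of two parallel attention branches concrete, the following is a minimal NumPy sketch of a block that attends across spatial tokens within each frame and, separately, across frames for each spatial token. It is not the authors' implementation: it uses a single attention head, and the token layout `(T, N, D)`, the parameter dictionary, and the residual sum used to fuse the two branches are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention.

    x: (..., L, D) tokens; attention runs over the L axis.
    wq, wk, wv: (D, D) projection matrices (hypothetical parameters).
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(x.shape[-1])
    return softmax(scores, axis=-1) @ v


def parallel_spatial_temporal_block(x, params):
    """x: (T, N, D) with T frames, N spatial tokens per frame, D channels."""
    # Spatial branch: each frame attends over its own N tokens.
    spatial = self_attention(x, *params["spatial"])
    # Temporal branch: each spatial token attends over the T frames.
    xt = np.swapaxes(x, 0, 1)  # (N, T, D)
    temporal = np.swapaxes(self_attention(xt, *params["temporal"]), 0, 1)
    # Assumed fusion: residual sum of both branches.
    return x + spatial + temporal


rng = np.random.default_rng(0)
T, N, D = 4, 6, 8
tokens = rng.normal(size=(T, N, D))
weights = {
    "spatial": [rng.normal(size=(D, D)) * 0.1 for _ in range(3)],
    "temporal": [rng.normal(size=(D, D)) * 0.1 for _ in range(3)],
}
out = parallel_spatial_temporal_block(tokens, weights)
print(out.shape)
```

Swapping the first two axes lets one attention routine serve both branches; a real encoder would add multiple heads, normalization, and feed-forward layers around this core.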

Results

Task	Dataset	Metric	Value (mm)	Model
3D Human Pose Estimation	MPI-INF-3DHP	MPJPE	83.6	MAED
3D Human Pose Estimation	MPI-INF-3DHP	PA-MPJPE	56.2	MAED
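
For readers unfamiliar with the two metrics in the table, here is a minimal NumPy sketch of how MPJPE and PA-MPJPE are commonly computed for a single pose. MPJPE is the mean Euclidean distance between predicted and ground-truth joints; PA-MPJPE first aligns the prediction to the ground truth with a similarity transform (Procrustes alignment: rotation, uniform scale, translation). The function names and the 17-joint example are assumptions for illustration, and the exact evaluation protocol behind the reported numbers (joint sets, root alignment) may differ.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error. pred, gt: (J, 3) arrays in the same unit."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def procrustes_align(pred, gt):
    """Align pred to gt with the best similarity transform (least squares)."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the 3x3 cross-covariance gives the optimal rotation (Kabsch).
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    dmat = np.diag([1.0, 1.0, d])
    rot = vt.T @ dmat @ u.T
    scale = (s * np.diag(dmat)).sum() / (p ** 2).sum()
    return scale * p @ rot.T + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of pred to gt."""
    return mpjpe(procrustes_align(pred, gt), gt)


rng = np.random.default_rng(0)
gt = rng.normal(size=(17, 3)) * 100.0  # hypothetical 17-joint pose, in mm
theta = np.deg2rad(30.0)
rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta), np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
# A prediction that differs from gt only by rotation, scale, and translation.
pred = 1.5 * gt @ rz.T + np.array([10.0, -20.0, 5.0])
print(mpjpe(pred, gt), pa_mpjpe(pred, gt))
```

Because the example prediction differs from the ground truth only by a similarity transform, its PA-MPJPE is numerically zero while its MPJPE is not, which is why PA-MPJPE values are lower than the corresponding MPJPE values.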
