ARTS: Semi-Analytical Regressor using Disentangled Skeletal Representations for Human Mesh Recovery from Videos

Tao Tang, Hong Liu, Yingxuan You, Ti Wang, Wenhao Li

2024-10-21

3D Human Pose Estimation, Disentanglement, Pose Estimation, Human Mesh Recovery

Abstract

Although existing video-based 3D human mesh recovery methods have made significant progress, simultaneously estimating human pose and shape from low-resolution image features limits their performance. These image features lack sufficient spatial information about the human body and contain various kinds of noise (e.g., background, lighting, and clothing), which often results in inaccurate pose and inconsistent motion. Inspired by the rapid advance in human pose estimation, we observe that, compared to image features, skeletons inherently contain accurate human pose and motion. Therefore, we propose a novel semi-Analytical Regressor using disenTangled Skeletal representations for human mesh recovery from videos, called ARTS. Specifically, a skeleton estimation and disentanglement module is proposed to estimate the 3D skeletons from a video and decouple them into disentangled skeletal representations (i.e., joint position, bone length, and human motion). Then, to fully utilize these representations, we introduce a semi-analytical regressor to estimate the parameters of the human mesh model. The regressor consists of three modules: Temporal Inverse Kinematics (TIK), Bone-guided Shape Fitting (BSF), and Motion-Centric Refinement (MCR). TIK utilizes joint position to estimate initial pose parameters, and BSF leverages bone length to regress bone-aligned shape parameters. Finally, MCR combines the human motion representation with image features to refine the initial human model parameters. Extensive experiments demonstrate that ARTS surpasses existing state-of-the-art video-based methods in both per-frame accuracy and temporal consistency on popular benchmarks: 3DPW, MPI-INF-3DHP, and Human3.6M. Code is available at https://github.com/TangTao-PKU/ARTS.
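
To make the idea of disentangled skeletal representations concrete, the following is a minimal sketch of how a sequence of 3D skeletons could be split into joint position, bone length, and motion. It is an illustration only, not the authors' implementation: the 5-joint `PARENTS` layout, the `disentangle` function, root-relative positions, frame-averaged bone lengths, and frame-to-frame displacement as "motion" are all assumptions chosen for simplicity.

```python
import math

# Hypothetical 5-joint skeleton: entry i is the parent of joint i (-1 marks the root).
PARENTS = [-1, 0, 1, 0, 3]


def disentangle(skeletons, parents=PARENTS):
    """Split a sequence of 3D skeletons into three illustrative representations.

    skeletons: list of frames, each a list of (x, y, z) joint coordinates.
    Returns (positions, bone_lengths, motion).
    """
    root = parents.index(-1)

    # Joint position: coordinates of every joint relative to the root joint.
    positions = [
        [tuple(c - r for c, r in zip(joint, frame[root])) for joint in frame]
        for frame in skeletons
    ]

    # Bone length: distance from each non-root joint to its parent,
    # averaged over all frames (bone lengths should not change over time).
    bones = [(j, p) for j, p in enumerate(parents) if p != -1]
    bone_lengths = [
        sum(math.dist(frame[j], frame[p]) for frame in skeletons) / len(skeletons)
        for j, p in bones
    ]

    # Motion: per-joint displacement between consecutive frames.
    motion = [
        [tuple(b - a for a, b in zip(j_prev, j_next)) for j_prev, j_next in zip(f_prev, f_next)]
        for f_prev, f_next in zip(skeletons, skeletons[1:])
    ]
    return positions, bone_lengths, motion
```

In this sketch a sequence of T frames yields T sets of root-relative positions, one length per bone, and T-1 displacement fields. The separation mirrors the abstract's pairing of joint position with pose (TIK), bone length with shape (BSF), and motion with refinement (MCR).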

Results

Task	Dataset	Metric	Value	Model
3D Human Pose Estimation	MPI-INF-3DHP	Acceleration Error	7.4	ARTS (Resnet50 L=16)
3D Human Pose Estimation	MPI-INF-3DHP	MPJPE	71.8	ARTS (Resnet50 L=16)
3D Human Pose Estimation	MPI-INF-3DHP	PA-MPJPE	53	ARTS (Resnet50 L=16)
3D Human Pose Estimation	3DPW	Acceleration Error	6.5	ARTS (Resnet50 L=16)
3D Human Pose Estimation	3DPW	MPJPE	67.7	ARTS (Resnet50 L=16)
3D Human Pose Estimation	3DPW	MPVPE	81.4	ARTS (Resnet50 L=16)
3D Human Pose Estimation	3DPW	PA-MPJPE	46.5	ARTS (Resnet50 L=16)
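
For readers unfamiliar with the metrics in the table above, here is a minimal sketch of how MPJPE and acceleration error are commonly computed from predicted and ground-truth joint sequences. The function names `mpjpe` and `accel_error` are illustrative, and the sketch assumes both sequences are already root-aligned and in the same units. Benchmark protocols may differ in joint subsets and alignment, and PA-MPJPE additionally applies a Procrustes (similarity) alignment before measuring the distance, which is not shown here.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance
    between predicted and ground-truth joints over all frames."""
    dists = [
        math.dist(p, g)
        for pred_frame, gt_frame in zip(pred, gt)
        for p, g in zip(pred_frame, gt_frame)
    ]
    return sum(dists) / len(dists)


def _acceleration(seq):
    # Second-order finite difference per joint: x[t-1] - 2*x[t] + x[t+1].
    return [
        [tuple(a - 2 * b + c for a, b, c in zip(j0, j1, j2)) for j0, j1, j2 in zip(f0, f1, f2)]
        for f0, f1, f2 in zip(seq, seq[1:], seq[2:])
    ]


def accel_error(pred, gt):
    """Average distance between predicted and ground-truth joint accelerations.
    Requires at least three frames."""
    diffs = [
        math.dist(p, g)
        for pred_frame, gt_frame in zip(_acceleration(pred), _acceleration(gt))
        for p, g in zip(pred_frame, gt_frame)
    ]
    return sum(diffs) / len(diffs)
```

Lower is better for both quantities: MPJPE reflects per-frame accuracy, while acceleration error reflects temporal consistency, the two properties the abstract highlights.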
