Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Monocular Expressive Body Regression through Body-Driven Attention

Vasileios Choutas, Georgios Pavlakos, Timo Bolkart, Dimitrios Tzionas, Michael J. Black

2020-08-20 · ECCV 2020
Tasks: 3D Human Pose Estimation · 3D Hand Pose Estimation · Regression · 3D Human Reconstruction · 3D Face Reconstruction · 3D Multi-Person Mesh Recovery
Paper · PDF · Code (official)

Abstract

To understand how people look, interact, or perform tasks, we need to quickly and accurately capture their 3D body, face, and hands together from an RGB image. Most existing methods focus only on parts of the body. A few recent approaches reconstruct full expressive 3D humans from images using 3D body models that include the face and hands. These methods are optimization-based and thus slow, prone to local optima, and require 2D keypoints as input. We address these limitations by introducing ExPose (EXpressive POse and Shape rEgression), which directly regresses the body, face, and hands, in SMPL-X format, from an RGB image. This is a hard problem due to the high dimensionality of the body and the lack of expressive training data. Additionally, hands and faces are much smaller than the body, occupying very few image pixels. This makes hand and face estimation hard when body images are downscaled for neural networks. We make three main contributions. First, we account for the lack of training data by curating a dataset of SMPL-X fits on in-the-wild images. Second, we observe that body estimation localizes the face and hands reasonably well. We introduce body-driven attention for face and hand regions in the original image to extract higher-resolution crops that are fed to dedicated refinement modules. Third, these modules exploit part-specific knowledge from existing face- and hand-only datasets. ExPose estimates expressive 3D humans more accurately than existing optimization methods at a small fraction of the computational cost. Our data, model and code are available for research at https://expose.is.tue.mpg.de.
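The body-driven attention step described in the abstract works by letting the body network's rough 2D localization of a hand or the face define a crop window, from which a higher-resolution patch is cut out of the original image and passed to a part-specific refinement module. A minimal, framework-free sketch of that idea (the helper names, the padding factor, and the nearest-neighbour crop are illustrative assumptions, not the paper's differentiable implementation):

```python
import numpy as np

def part_center_and_size(keypoints_2d, pad=1.5):
    """Square region around the predicted 2D keypoints of one part
    (e.g. the left-hand joints), padded so the crop keeps some context."""
    lo, hi = keypoints_2d.min(0), keypoints_2d.max(0)
    center = (lo + hi) / 2
    size = float((hi - lo).max()) * pad
    return center, size

def crop_patch(image, center, size, out_hw=(256, 256)):
    """Nearest-neighbour crop-and-resize of a square patch around `center`.
    A stand-in for the differentiable crop a real pipeline would use."""
    H, W = image.shape[:2]
    half = size / 2
    ys = np.clip(np.linspace(center[1] - half, center[1] + half, out_hw[0]).astype(int), 0, H - 1)
    xs = np.clip(np.linspace(center[0] - half, center[0] + half, out_hw[1]).astype(int), 0, W - 1)
    return image[np.ix_(ys, xs)]

# Hypothetical usage: hand keypoints predicted by the body network, in
# original-image pixel coordinates, drive a high-resolution hand crop.
image = np.random.rand(1024, 1024, 3)                       # full-resolution input
hand_kps = np.array([[610.0, 420.0], [655.0, 470.0], [630.0, 445.0]])
center, size = part_center_and_size(hand_kps)
hand_crop = crop_patch(image, center, size)                 # fed to the hand refinement module
```

The point of cropping from the original image rather than the downscaled body input is exactly the pixel-budget problem the abstract raises: the hand occupies few pixels at body resolution, but a dedicated 256×256 crop restores detail for the refinement module.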

Results

EHF = Expressive Hands and Faces dataset.

Task | Dataset | Metric | Value | Model
Reconstruction | EHF | MPJPE, left hand | 13.5 | ExPose
Reconstruction | EHF | MPJPE-14 | 62.8 | ExPose
Reconstruction | EHF | PA V2V (mm), body only | 52.6 | ExPose
Reconstruction | EHF | PA V2V (mm), face | 5.8 | ExPose
Reconstruction | EHF | PA V2V (mm), left hand | 13.1 | ExPose
Reconstruction | EHF | PA V2V (mm), whole body | 54.5 | ExPose
Reconstruction | EHF | TR V2V (mm), body only | 76.8 | ExPose
Reconstruction | EHF | TR V2V (mm), face | 15.9 | ExPose
Reconstruction | EHF | TR V2V (mm), left hand | 31.2 | ExPose
Reconstruction | EHF | TR V2V (mm), whole body | 65.7 | ExPose
Reconstruction | EHF | mean P2S | 28.9 | ExPose
Reconstruction | EHF | median P2S | 18 | ExPose
Reconstruction | AGORA | B-MPJPE | 150.4 | ExPose
Reconstruction | AGORA | B-MVE | 151.5 | ExPose
Reconstruction | AGORA | B-NMJE | 183.4 | ExPose
Reconstruction | AGORA | B-NMVE | 184.8 | ExPose
Reconstruction | AGORA | F-MPJPE | 55.2 | ExPose
Reconstruction | AGORA | F-MVE | 51.1 | ExPose
Reconstruction | AGORA | FB-MPJPE | 215.9 | ExPose
Reconstruction | AGORA | FB-MVE | 217.3 | ExPose
Reconstruction | AGORA | FB-NMJE | 263.3 | ExPose
Reconstruction | AGORA | FB-NMVE | 265 | ExPose
3D Human Pose Estimation | 3DPW | MPJPE | 93.4 | ExPose
3D Human Pose Estimation | 3DPW | PA-MPJPE | 60.7 | ExPose
3D Hand Pose Estimation | FreiHAND | PA-F@15mm | 0.918 | ExPose (hand sub-network h)
3D Hand Pose Estimation | FreiHAND | PA-F@5mm | 0.484 | ExPose (hand sub-network h)
3D Hand Pose Estimation | FreiHAND | PA-MPJPE | 12.2 | ExPose (hand sub-network h)
3D Hand Pose Estimation | FreiHAND | PA-MPVPE | 11.8 | ExPose (hand sub-network h)
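Most of the joint metrics above are MPJPE variants: mean per-joint position error in millimetres. The "PA" prefix means the prediction is first aligned to the ground truth with a similarity (Procrustes) transform, so global scale, rotation, and translation are factored out and only pose is scored; V2V metrics apply the same per-point distance to mesh vertices instead of joints. A minimal NumPy sketch of both metrics, using a standard Umeyama alignment (an illustration of the metric definition, not the evaluation code behind these leaderboards):

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance,
    in the same units as the input (here, whatever unit the joints use)."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

def pa_mpjpe(pred, gt):
    """Procrustes-aligned MPJPE: find the similarity transform (scale,
    rotation, translation) that best maps pred onto gt, then measure MPJPE
    of the aligned joints.  pred, gt: (num_joints, 3) arrays."""
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    p, g = pred - mu_p, gt - mu_g
    # Umeyama alignment: optimal rotation from the SVD of the covariance.
    U, S, Vt = np.linalg.svd(g.T @ p)
    d = np.sign(np.linalg.det(U @ Vt))      # guard against reflections
    D = np.diag([1.0] * (pred.shape[1] - 1) + [d])
    R = U @ D @ Vt
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()
    aligned = scale * p @ R.T + mu_g
    return mpjpe(aligned, gt)
```

This is why PA numbers are always lower than their unaligned counterparts in the table (e.g. 3DPW MPJPE 93.4 vs. PA-MPJPE 60.7): the alignment removes every error mode except articulation itself.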

Related Papers

Language Integration in Fine-Tuning Multimodal Large Language Models for Image-Based Regression2025-07-20Neural Network-Guided Symbolic Regression for Interpretable Descriptor Discovery in Perovskite Catalysts2025-07-16Imbalanced Regression Pipeline Recommendation2025-07-16Second-Order Bounds for [0,1]-Valued Regression via Betting Loss2025-07-16Sparse Regression Codes exploit Multi-User Diversity without CSI2025-07-15Bradley-Terry and Multi-Objective Reward Modeling Are Complementary2025-07-10Active Learning for Manifold Gaussian Process Regression2025-06-26A Survey of Predictive Maintenance Methods: An Analysis of Prognostics via Classification and Regression2025-06-25