Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Space-Time-Separable Graph Convolutional Network for Pose Forecasting

Theodoros Sofianos, Alessio Sampieri, Luca Franco, Fabio Galasso

Published: 2021-10-09 · ICCV 2021
Tasks: Human Pose Forecasting, Time Series, Time Series Analysis
Links: Paper | PDF | Code (official)

Abstract

Human pose forecasting is a complex structured-data sequence-modelling task, which has received increasing attention, also due to numerous potential applications. Research has mainly addressed the temporal dimension as time series and the interaction of human body joints with a kinematic tree or by a graph. This has decoupled the two aspects and leveraged progress from the relevant fields, but it has also limited the understanding of the complex structural joint spatio-temporal dynamics of the human pose. Here we propose a novel Space-Time-Separable Graph Convolutional Network (STS-GCN) for pose forecasting. For the first time, STS-GCN models the human pose dynamics only with a graph convolutional network (GCN), including the temporal evolution and the spatial joint interaction within a single-graph framework, which allows the cross-talk of motion and spatial correlations. Concurrently, STS-GCN is the first space-time-separable GCN: the space-time graph connectivity is factored into space and time affinity matrices, which bottlenecks the space-time cross-talk, while enabling full joint-joint and time-time correlations. Both affinity matrices are learnt end-to-end, which results in connections substantially deviating from the standard kinematic tree and the linear-time time series. In experimental evaluation on three complex, recent and large-scale benchmarks, Human3.6M [Ionescu et al. TPAMI'14], AMASS [Mahmood et al. ICCV'19] and 3DPW [Von Marcard et al. ECCV'18], STS-GCN outperforms the state of the art, surpassing the current best technique [Mao et al. ECCV'20] by over 32% on average at the most difficult long-term predictions, while only requiring 1.7% of its parameters. We explain the results qualitatively and illustrate the graph interactions by the factored joint-joint and time-time learnt graph connections. Our source code is available at: https://github.com/FraLuca/STSGCN
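The key idea in the abstract — factoring the space-time graph connectivity into separate space and time affinity matrices — can be illustrated with a minimal NumPy sketch. This is a simplified toy version with a single shared joint-joint matrix and a single shared time-time matrix (the paper learns richer, per-frame and per-joint affinities end-to-end); all variable names and shapes here are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

# Illustrative sizes: T frames, V body joints, C_in/C_out channels.
T, V, C_in, C_out = 10, 22, 3, 16
rng = np.random.default_rng(0)

X = rng.standard_normal((T, V, C_in))    # pose sequence (frames x joints x coords)
A_s = rng.standard_normal((V, V))        # learnable joint-joint (space) affinity
A_t = rng.standard_normal((T, T))        # learnable time-time affinity
W = rng.standard_normal((C_in, C_out))   # channel projection

# Separable propagation: mix frames with A_t, joints with A_s, channels with W.
# A full space-time graph would need a (T*V) x (T*V) adjacency; the factored
# form stores only T*T + V*V affinity entries, bottlenecking space-time
# cross-talk while keeping full joint-joint and time-time correlations.
H = np.einsum('tu,vw,uwc,cd->tvd', A_t, A_s, X, W)
print(H.shape)  # (10, 22, 16)
```

The factored form is exactly equivalent to applying the Kronecker product `np.kron(A_t, A_s)` as one dense space-time adjacency on the flattened sequence, but with quadratically fewer affinity parameters — which is consistent with the parameter savings the abstract reports.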

Results

Task             | Dataset   | Metric                            | Value | Model
Pose Forecasting | HARPER    | Average MPJPE (mm) @ 1000 ms      | 171   | STS-GCN
Pose Forecasting | HARPER    | Average MPJPE (mm) @ 400 ms       | 120   | STS-GCN
Pose Forecasting | HARPER    | Last Frame MPJPE (mm) @ 1000 ms   | 260   | STS-GCN
Pose Forecasting | HARPER    | Last Frame MPJPE (mm) @ 400 ms    | 147   | STS-GCN
Pose Forecasting | AMASS     | Average MPJPE (mm) @ 1000 ms      | 45.5  | STS-GCN
Pose Forecasting | Human3.6M | Average MPJPE (mm) @ 1000 ms      | 117   | STS-GCN
Pose Forecasting | Human3.6M | Average MPJPE (mm) @ 400 ms       | 65.8  | STS-GCN
Pose Forecasting | Human3.6M | MAR, walking, 1000 ms             | 0.87  | STS-GCN
Pose Forecasting | Human3.6M | MAR, walking, 400 ms              | 0.55  | STS-GCN
Pose Forecasting | 3DPW      | Average MPJPE (mm) @ 1000 ms      | 42.3  | STS-GCN
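MPJPE (Mean Per-Joint Position Error), the main metric in the table, is the Euclidean distance between predicted and ground-truth 3D joint positions, averaged over joints and frames; "Average" variants aggregate over all frames up to the prediction horizon, while "Last Frame" evaluates only at the horizon. A minimal sketch (function name and array layout are assumptions for illustration):

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error.

    pred, gt: arrays of shape (frames, joints, 3) holding 3D joint
    positions, in the same unit as the reported results (mm here).
    Returns the Euclidean distance per joint, averaged over all
    frames and joints.
    """
    return np.linalg.norm(pred - gt, axis=-1).mean()

# Toy usage: every predicted joint is offset by (3, 4, 0) mm -> error 5 mm.
gt = np.zeros((2, 3, 3))
pred = gt.copy()
pred[..., 0] = 3.0
pred[..., 1] = 4.0
print(mpjpe(pred, gt))  # 5.0
```

Lower is better: the 42.3 mm on 3DPW at 1000 ms, for instance, means the predicted joints lie on average about 4 cm from the ground truth one second into the future.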

Related Papers

- MoTM: Towards a Foundation Model for Time Series Imputation based on Continuous Modeling (2025-07-17)
- The Power of Architecture: Deep Dive into Transformer Architectures for Long-Term Time Series Forecasting (2025-07-17)
- Emergence of Functionally Differentiated Structures via Mutual Information Optimization in Recurrent Neural Networks (2025-07-17)
- Data Augmentation in Time Series Forecasting through Inverted Framework (2025-07-15)
- D3FL: Data Distribution and Detrending for Robust Federated Learning in Non-linear Time-series Data (2025-07-15)
- Towards Interpretable Time Series Foundation Models (2025-07-10)
- MoFE-Time: Mixture of Frequency Domain Experts for Time-Series Forecasting Models (2025-07-09)
- Foundation models for time series forecasting: Application in conformal prediction (2025-07-09)