Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Task-Generic Hierarchical Human Motion Prior using VAEs

Jiaman Li, Ruben Villegas, Duygu Ceylan, Jimei Yang, Zhengfei Kuang, Hao Li, Yajie Zhao

2021-06-07 · Pose Estimation · Motion Synthesis

Abstract

A deep generative model that describes human motions can benefit a wide range of fundamental computer vision and graphics tasks, such as providing robustness to video-based human pose estimation, predicting complete body movements for motion capture systems during occlusions, and assisting keyframe animation with plausible movements. In this paper, we present a method for learning complex human motions independent of specific tasks, using a combined global and local latent space to facilitate coarse and fine-grained modeling. Specifically, we propose a hierarchical motion variational autoencoder (HM-VAE) that consists of a 2-level hierarchical latent space. While the global latent space captures the overall global body motion, the local latent space captures the refined poses of the different body parts. We demonstrate the effectiveness of our hierarchical motion variational autoencoder in a variety of tasks including video-based human pose estimation, motion completion from partial observations, and motion synthesis from sparse keyframes. Even though our model has not been trained for any of these tasks specifically, it provides superior performance compared to task-specific alternatives. Our general-purpose human motion prior model can fix corrupted human body animations and generate complete movements from incomplete observations.
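The core idea of the abstract — a 2-level latent space where one global code summarizes whole-body motion and per-part local codes refine individual poses — can be sketched as follows. This is a minimal illustrative sketch with made-up dimensions and random linear layers standing in for the paper's learned encoders/decoders; none of the names or sizes come from the paper itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- not taken from the paper.
T, J, D = 16, 24, 3        # frames, joints, per-joint features
GLOBAL_Z, LOCAL_Z = 8, 4   # global and per-part latent sizes
PARTS = 4                  # body-part groups, each with its own local latent

def reparameterize(mu, logvar):
    # Standard VAE reparameterization trick.
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

def encode(motion):
    # Flatten the motion into one feature vector (a stand-in for the
    # paper's learned sequence encoder).
    x = motion.reshape(-1)
    # Global latent: a single code for the overall body motion.
    W_g = rng.standard_normal((2 * GLOBAL_Z, x.size)) * 0.01
    g_mu, g_logvar = np.split(W_g @ x, 2)
    # Local latents: one code per body-part group, refining its poses.
    part_feats = motion.reshape(T, PARTS, -1).mean(axis=0)   # (PARTS, feat)
    W_l = rng.standard_normal((2 * LOCAL_Z, part_feats.shape[1])) * 0.01
    l_mu, l_logvar = np.split(part_feats @ W_l.T, 2, axis=1)
    return (g_mu, g_logvar), (l_mu, l_logvar)

def decode(z_global, z_local):
    # Each part's decoder sees the shared global code plus its own local code.
    z = np.concatenate([np.repeat(z_global[None], PARTS, axis=0), z_local], axis=1)
    W_d = rng.standard_normal((T * J * D // PARTS, z.shape[1])) * 0.01
    return (z @ W_d.T).reshape(T, J, D)

motion = rng.standard_normal((T, J, D))
(g_mu, g_lv), (l_mu, l_lv) = encode(motion)
recon = decode(reparameterize(g_mu, g_lv), reparameterize(l_mu, l_lv))
print(recon.shape)  # (16, 24, 3)
```

The point of the two levels is that tasks like motion completion can sample or optimize the coarse global code while the local codes keep per-part poses plausible.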

Results

Task                     | Dataset | Metric | Value | Model
Pose Tracking            | LaFAN1  | L2Q@5  | 0.24  | HM-VAE
Pose Tracking            | LaFAN1  | L2Q@15 | 0.54  | HM-VAE
Pose Tracking            | LaFAN1  | L2Q@30 | 0.94  | HM-VAE
Motion Synthesis         | LaFAN1  | L2Q@5  | 0.24  | HM-VAE
Motion Synthesis         | LaFAN1  | L2Q@15 | 0.54  | HM-VAE
Motion Synthesis         | LaFAN1  | L2Q@30 | 0.94  | HM-VAE
10-shot image generation | LaFAN1  | L2Q@5  | 0.24  | HM-VAE
10-shot image generation | LaFAN1  | L2Q@15 | 0.54  | HM-VAE
10-shot image generation | LaFAN1  | L2Q@30 | 0.94  | HM-VAE
3D Human Pose Tracking   | LaFAN1  | L2Q@5  | 0.24  | HM-VAE
3D Human Pose Tracking   | LaFAN1  | L2Q@15 | 0.54  | HM-VAE
3D Human Pose Tracking   | LaFAN1  | L2Q@30 | 0.94  | HM-VAE
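L2Q in the table is the global-quaternion L2 metric used by the LaFAN1 benchmark, where @N denotes the length of the in-between transition in frames. A minimal sketch of one common formulation follows; the exact aggregation order in the official benchmark code may differ, so treat this as an assumption:

```python
import numpy as np

def l2q(pred, gt):
    """Average L2 distance between predicted and ground-truth global
    quaternions. Both inputs are (frames, joints, 4) arrays of unit
    quaternions. The per-element averaging here is an assumed
    simplification of the official LaFAN1 aggregation."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

# Identical predictions score 0; larger values mean worse reconstruction.
q = np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (30, 22, 1))
print(l2q(q, q))  # 0.0
```

Lower is better, which matches the table: error grows from 0.24 at 5-frame transitions to 0.94 at 30-frame transitions as the model must hallucinate more motion.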

Related Papers

- $π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
- Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
- DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
- From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
- AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
- SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
- SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
- Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)