Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


RemoCap: Disentangled Representation Learning for Motion Capture

Hongsheng Wang, Lizao Zhang, Zhangnan Zhong, Shuolin Xu, Xinrui Zhou, Shengyu Zhang, Huahao Xu, Fei Wu, Feng Lin

2024-05-21 · 3D Human Pose Estimation · Representation Learning · Disentanglement · Motion Disentanglement

Abstract

Reconstructing 3D human bodies from realistic motion sequences remains a challenge due to pervasive and complex occlusions. Current methods struggle to capture the dynamics of occluded body parts, leading to model penetration and distorted motion. RemoCap leverages Spatial Disentanglement (SD) and Motion Disentanglement (MD) to overcome these limitations. SD addresses occlusion interference between the target human body and surrounding objects. It achieves this by disentangling target features along the dimension axis. By aligning features based on their spatial positions in each dimension, SD isolates the target object's response within a global window, enabling accurate capture despite occlusions. The MD module employs a channel-wise temporal shuffling strategy to simulate diverse scene dynamics. This process effectively disentangles motion features, allowing RemoCap to reconstruct occluded parts with greater fidelity. Furthermore, this paper introduces a sequence velocity loss that promotes temporal coherence. This loss constrains inter-frame velocity errors, ensuring the predicted motion exhibits realistic consistency. Extensive comparisons with state-of-the-art (SOTA) methods on benchmark datasets demonstrate RemoCap's superior performance in 3D human body reconstruction. On the 3DPW dataset, RemoCap surpasses all competitors, achieving the best results in MPVPE (81.9), MPJPE (72.7), and PA-MPJPE (44.1) metrics. Codes are available at https://wanghongsheng01.github.io/RemoCap/.
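The sequence velocity loss described in the abstract constrains inter-frame velocity errors. A minimal sketch of such a loss follows; the L1 norm, the mean reduction, and the array layout are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def sequence_velocity_loss(pred, gt):
    """Error between inter-frame velocities of predicted and ground-truth
    joint sequences.

    pred, gt: arrays of shape (T, J, 3) -- T frames, J joints, 3D coords.
    The L1 norm and mean reduction here are assumptions for illustration.
    """
    # Inter-frame velocity: difference between consecutive frames.
    pred_vel = pred[1:] - pred[:-1]   # (T-1, J, 3)
    gt_vel = gt[1:] - gt[:-1]         # (T-1, J, 3)
    # Penalise the discrepancy between the two velocity sequences,
    # which promotes temporal coherence of the predicted motion.
    return np.abs(pred_vel - gt_vel).mean()
```

Note that a constant per-sequence offset leaves the loss at zero: the term only sees frame-to-frame motion, which is why it complements, rather than replaces, a per-frame position loss.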

Results

Task | Dataset | Metric | Value | Model
3D Human Pose Estimation | 3DPW | MPJPE | 72.7 | RemoCap
3D Human Pose Estimation | 3DPW | MPVPE | 81.9 | RemoCap
3D Human Pose Estimation | 3DPW | PA-MPJPE | 44.1 | RemoCap
Pose Estimation | 3DPW | MPJPE | 72.7 | RemoCap
Pose Estimation | 3DPW | MPVPE | 81.9 | RemoCap
Pose Estimation | 3DPW | PA-MPJPE | 44.1 | RemoCap
3D | 3DPW | MPJPE | 72.7 | RemoCap
3D | 3DPW | MPVPE | 81.9 | RemoCap
3D | 3DPW | PA-MPJPE | 44.1 | RemoCap
1 Image, 2*2 Stitchi | 3DPW | MPJPE | 72.7 | RemoCap
1 Image, 2*2 Stitchi | 3DPW | MPVPE | 81.9 | RemoCap
1 Image, 2*2 Stitchi | 3DPW | PA-MPJPE | 44.1 | RemoCap
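The metrics in the results table measure average 3D position error, conventionally in millimetres. A minimal sketch of the MPJPE computation, assuming per-frame joint arrays of shape (J, 3):

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error: average Euclidean distance between
    predicted and ground-truth 3D joints, typically reported in mm.

    pred, gt: arrays of shape (J, 3). PA-MPJPE is the same quantity after
    a Procrustes (rigid + scale) alignment of pred to gt, and MPVPE applies
    the identical formula to mesh vertices instead of joints.
    """
    # Per-joint Euclidean distance, then average over joints.
    return np.linalg.norm(pred - gt, axis=-1).mean()
```

For example, a prediction in which every joint is offset by the vector (3, 4, 0) mm from the ground truth yields an MPJPE of exactly 5 mm.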

Related Papers

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper (2025-07-20)
CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models (2025-07-18)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Boosting Team Modeling through Tempo-Relational Representation Learning (2025-07-17)
Similarity-Guided Diffusion for Contrastive Sequential Recommendation (2025-07-16)
Are encoders able to learn landmarkers for warm-starting of Hyperparameter Optimization? (2025-07-16)
Language-Guided Contrastive Audio-Visual Masked Autoencoder with Automatically Generated Audio-Visual-Text Triplets from Videos (2025-07-16)
A Mixed-Primitive-based Gaussian Splatting Method for Surface Reconstruction (2025-07-15)