Ke Fan, Junshu Tang, Weijian Cao, Ran Yi, Moran Li, Jingyu Gong, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Lizhuang Ma
Text-to-motion synthesis is a crucial task in computer vision. Existing methods are limited in their universality: they are tailored to single-person or two-person scenarios and cannot be applied to generate motions for more individuals. To achieve number-free motion synthesis, this paper reconsiders motion generation and proposes to unify single- and multi-person motion through the conditional motion distribution. Furthermore, a generation module and an interaction module are designed for our FreeMotion framework to decouple the process of conditional motion generation and ultimately support number-free motion synthesis. In addition, existing single-person spatial control methods can be seamlessly integrated into our framework, enabling precise control of multi-person motion. Extensive experiments demonstrate the superior performance of our method and its capability to infer single- and multi-human motions simultaneously.
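One way to read "unify single- and multi-person motion through the conditional motion distribution" is via the chain rule of probability. The factorization below is an illustrative sketch, not a formula quoted from the abstract. The symbols are assumptions introduced here: $x^{(i)}$ is the motion of the $i$-th person, $c$ is the text condition, and $N$ is the number of people.

```latex
% Illustrative assumption: chain-rule factorization of the joint motion distribution.
% x^{(i)}: motion of person i, c: text prompt, N: number of people (all hypothetical symbols).
\begin{equation}
  p\bigl(x^{(1)}, \dots, x^{(N)} \mid c\bigr)
  = \prod_{i=1}^{N} p\bigl(x^{(i)} \mid x^{(1)}, \dots, x^{(i-1)}, c\bigr)
\end{equation}
% For N = 1 the product has a single factor, p(x^{(1)} | c),
% which is the usual single-person text-to-motion distribution.
```

Under this reading, a model of the conditional factor could in principle be applied repeatedly for any $N$. How the generation and interaction modules divide this work is not specified in the abstract.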
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Motion Synthesis | InterHuman | FID | 6.74 | FreeMotion |
| Motion Synthesis | InterHuman | MMDist | 3.848 | FreeMotion |
| Motion Synthesis | InterHuman | MModality | 1.226 | FreeMotion |
| Motion Synthesis | InterHuman | R-Precision Top3 | 0.544 | FreeMotion |