Hsin-Ying Lee, Xiaodong Yang, Ming-Yu Liu, Ting-Chun Wang, Yu-Ding Lu, Ming-Hsuan Yang, Jan Kautz
Dancing to music is an instinctive move by humans. Learning to model the music-to-dance generation process is, however, a challenging problem. It requires significant effort to measure the correlation between music and dance, as one needs to simultaneously consider multiple aspects, such as the style and beat of both music and dance. Additionally, dance is inherently multimodal: at any moment, various movements following a given pose are equally likely. In this paper, we propose a synthesis-by-analysis learning framework to generate dance from music. In the analysis phase, we decompose a dance into a series of basic dance units, through which the model learns how to move. In the synthesis phase, the model learns how to compose a dance by organizing multiple basic dancing movements seamlessly according to the input music. Qualitative and quantitative experimental results demonstrate that the proposed method can synthesize realistic, diverse, style-consistent, and beat-matching dances from music.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Pose Tracking | BRACE | Beat DTW cost | 11.6 | Dancing 2 Music |
| Pose Tracking | BRACE | Beat alignment score | 0.129 | Dancing 2 Music |
| Pose Tracking | BRACE | Footwork average | 50.09 | Dancing 2 Music |
| Pose Tracking | BRACE | Frechet Inception Distance | 0.5884 | Dancing 2 Music |
| Pose Tracking | BRACE | Powermove average | 33.87 | Dancing 2 Music |
| Pose Tracking | BRACE | Toprock average | 16.04 | Dancing 2 Music |
| Motion Synthesis | BRACE | Beat DTW cost | 11.6 | Dancing 2 Music |
| Motion Synthesis | BRACE | Beat alignment score | 0.129 | Dancing 2 Music |
| Motion Synthesis | BRACE | Footwork average | 50.09 | Dancing 2 Music |
| Motion Synthesis | BRACE | Frechet Inception Distance | 0.5884 | Dancing 2 Music |
| Motion Synthesis | BRACE | Powermove average | 33.87 | Dancing 2 Music |
| Motion Synthesis | BRACE | Toprock average | 16.04 | Dancing 2 Music |
| 10-shot image generation | BRACE | Beat DTW cost | 11.6 | Dancing 2 Music |
| 10-shot image generation | BRACE | Beat alignment score | 0.129 | Dancing 2 Music |
| 10-shot image generation | BRACE | Footwork average | 50.09 | Dancing 2 Music |
| 10-shot image generation | BRACE | Frechet Inception Distance | 0.5884 | Dancing 2 Music |
| 10-shot image generation | BRACE | Powermove average | 33.87 | Dancing 2 Music |
| 10-shot image generation | BRACE | Toprock average | 16.04 | Dancing 2 Music |
| 3D Human Pose Tracking | BRACE | Beat DTW cost | 11.6 | Dancing 2 Music |
| 3D Human Pose Tracking | BRACE | Beat alignment score | 0.129 | Dancing 2 Music |
| 3D Human Pose Tracking | BRACE | Footwork average | 50.09 | Dancing 2 Music |
| 3D Human Pose Tracking | BRACE | Frechet Inception Distance | 0.5884 | Dancing 2 Music |
| 3D Human Pose Tracking | BRACE | Powermove average | 33.87 | Dancing 2 Music |
| 3D Human Pose Tracking | BRACE | Toprock average | 16.04 | Dancing 2 Music |
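
The table reports a beat alignment score without defining it. As a rough illustration only, the sketch below follows one common formulation: for each kinematic (motion) beat, take the distance to the nearest music beat, pass it through a Gaussian, and average the results. Whether the BRACE benchmark uses exactly this definition is an assumption, and the function name `beat_alignment_score`, the frame-index inputs, and the default `sigma` are hypothetical choices made for the example.

```python
import math


def beat_alignment_score(kinematic_beats, music_beats, sigma=3.0):
    """Mean Gaussian-weighted closeness of each motion beat to its nearest music beat.

    Assumption: beats are given as frame indices (or any shared time unit),
    and sigma is a tolerance in that same unit. This is an illustrative
    formulation, not necessarily the one used for the numbers in the table.
    """
    if not kinematic_beats or not music_beats:
        return 0.0  # nothing to align

    total = 0.0
    for t in kinematic_beats:
        # Distance from this motion beat to the closest music beat.
        nearest = min(abs(t - m) for m in music_beats)
        total += math.exp(-(nearest ** 2) / (2 * sigma ** 2))
    return total / len(kinematic_beats)


if __name__ == "__main__":
    # Perfectly aligned beats score 1.0; misaligned beats score lower.
    print(beat_alignment_score([10, 20, 30], [10, 20, 30]))
    print(beat_alignment_score([13, 24, 35], [10, 20, 30]))
```

Under this formulation the score lies in [0, 1], with higher values indicating that motion beats fall closer to music beats.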