InterHuman
Modalities: Images, Texts
Introduced: 2023-04-12
InterHuman is a multimodal dataset of about 107M frames covering diverse two-person interactions, with accurate skeletal motions and 16,756 natural language descriptions.
Source: InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions
Image Source: GitHub Repo: InterGen
Benchmarks
10-shot image generation: FID, R-Precision Top3, MMDist, MModality
3D Human Pose Tracking: FID, R-Precision Top3, MMDist, MModality
Motion Synthesis: FID, R-Precision Top3, MMDist, MModality
Pose Tracking: FID, R-Precision Top3, MMDist, MModality