CMU Panoptic is a large-scale dataset providing 3D pose annotations (1.5 million) for multiple people engaging in social activities. It contains 65 videos (5.5 hours) with multi-view annotations, but only 17 of them show multi-person scenarios and include camera parameters.
Massively Multiview System
- 480 VGA camera views
- 30+ HD views
- 10 RGB-D sensors
- Hardware-based sync
- Calibration
Interesting Scenes with Labels
- Multiple people
- Socially interacting groups
- 3D body pose
- 3D facial landmarks
- Transcripts + speaker ID
Hardware setup
- 480 VGA cameras, 640 x 480 resolution, 25 fps, synchronized among themselves using a hardware clock
- 31 HD cameras, 1920 x 1080 resolution, 30 fps, synchronized among themselves using a hardware clock, timing aligned with VGA cameras
- 10 Kinect Ⅱ sensors, 1920 x 1080 (RGB), 512 x 424 (depth), 30 fps, timing aligned among themselves and with other sensors
- 5 DLP projectors, synchronized with HD cameras
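
Because the VGA and HD cameras run at different frame rates (25 fps and 30 fps as listed above), using both streams together means mapping a frame in one stream to the frame closest in time in the other. The following is a minimal sketch of that mapping, not the dataset's own tooling. It assumes the nominal frame rates from the list, a shared time origin at frame 0, and no dropped frames; the names `VGA_FPS`, `HD_FPS`, and `nearest_frame` are hypothetical. Real recordings may deviate from these nominal values, so any synchronization data shipped with a sequence should take precedence.

```python
import math
from fractions import Fraction

# Nominal frame rates taken from the hardware list above (assumption:
# the real capture matches these values exactly).
VGA_FPS = Fraction(25)
HD_FPS = Fraction(30)


def nearest_frame(src_index: int, src_fps: Fraction, dst_fps: Fraction) -> int:
    """Return the index of the destination frame closest in time to a source frame.

    Assumes both streams start at time 0 on frame 0 and drop no frames.
    Exact rational arithmetic avoids floating-point drift on long sequences.
    """
    timestamp = Fraction(src_index) / src_fps          # seconds since start
    # Round half up to the nearest destination frame index.
    return math.floor(timestamp * dst_fps + Fraction(1, 2))


if __name__ == "__main__":
    # One second into the recording, VGA frame 25 lines up with HD frame 30.
    print(nearest_frame(25, VGA_FPS, HD_FPS))
    # Mapping back from HD to VGA.
    print(nearest_frame(30, HD_FPS, VGA_FPS))
```

With these assumptions, the two streams coincide exactly every 0.2 seconds (every 5 VGA frames and every 6 HD frames); in between, the nearest HD frame is at most half an HD frame interval away from the VGA timestamp.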
Source: Single-Stage Multi-Person Pose Machines
Image Source: http://domedb.perception.cs.cmu.edu/