Yao Feng, Vasileios Choutas, Timo Bolkart, Dimitrios Tzionas, Michael J. Black
Recovering expressive humans from images is essential for understanding human behavior. Methods that estimate 3D bodies, faces, or hands have progressed significantly, yet separately. Face methods recover accurate 3D shape and geometric details, but need a tight crop and struggle with extreme views and low resolution. Whole-body methods are robust to a wide range of poses and resolutions, but provide only a rough 3D face shape without details like wrinkles. To get the best of both worlds, we introduce PIXIE, which produces animatable, whole-body 3D avatars with realistic facial detail from a single image. PIXIE builds on two key observations. First, existing work combines independent estimates from body, face, and hand experts by trusting them equally. PIXIE instead introduces a novel moderator that merges the features of the experts, weighted by their confidence. Because SMPL-X uses a shared shape space across all body parts, every part expert can contribute to the whole. Second, human shape is highly correlated with gender, but existing work ignores this. We label training images as male, female, or non-binary, and train PIXIE to infer "gendered" 3D body shapes with a novel shape loss. In addition to 3D body pose and shape parameters, PIXIE estimates expression, illumination, albedo, and 3D facial surface displacements. Quantitative and qualitative evaluations show that PIXIE estimates more accurate whole-body shape and detailed face shape than the state of the art. Models and code are available at https://pixie.is.tue.mpg.de.
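
The confidence-weighted merging described in the abstract can be illustrated with a small sketch. This is not the released PIXIE code: the function name `fuse_features`, the single linear scoring layer (`W`, `b`), the softmax, and the `temperature` parameter are all assumptions chosen to show the general idea of blending a body feature and a part feature by a predicted confidence, instead of trusting both equally.

```python
import numpy as np

def fuse_features(body_feat, part_feat, W, b, temperature=1.0):
    """Blend two feature vectors using confidence weights.

    Hypothetical moderator: a linear layer scores the concatenated
    features, and a softmax turns the two scores into weights that
    sum to one. A real moderator would be a trained network.
    """
    joint = np.concatenate([body_feat, part_feat])
    logits = (W @ joint + b) / temperature      # shape (2,)
    logits = logits - logits.max()              # numerical stability
    weights = np.exp(logits) / np.exp(logits).sum()
    # Convex combination: the more confident expert dominates.
    fused = weights[0] * body_feat + weights[1] * part_feat
    return fused, weights

# Toy usage with random (untrained) parameters.
rng = np.random.default_rng(0)
dim = 8
body_feat = rng.normal(size=dim)
part_feat = rng.normal(size=dim)
W = rng.normal(scale=0.1, size=(2, 2 * dim))
b = np.zeros(2)
fused, weights = fuse_features(body_feat, part_feat, W, b)
```

With all-zero `W` and `b`, the two weights are equal and the result reduces to plain averaging, which corresponds to the "trust them equally" baseline the abstract contrasts against.
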
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Reconstruction | Expressive hands and faces dataset (EHF) | MPJPE, left hand | 11.7 | PIXIE |
| Reconstruction | Expressive hands and faces dataset (EHF) | MPJPE-14 | 61.5 | PIXIE |
| Reconstruction | Expressive hands and faces dataset (EHF) | PA V2V (mm), body only | 53 | PIXIE |
| Reconstruction | Expressive hands and faces dataset (EHF) | PA V2V (mm), face | 4.6 | PIXIE |
| Reconstruction | Expressive hands and faces dataset (EHF) | PA V2V (mm), left hand | 11.2 | PIXIE |
| Reconstruction | Expressive hands and faces dataset (EHF) | PA V2V (mm), whole body | 55 | PIXIE |
| Reconstruction | Expressive hands and faces dataset (EHF) | TR V2V (mm), body only | 75.8 | PIXIE |
| Reconstruction | Expressive hands and faces dataset (EHF) | TR V2V (mm), face | 14.2 | PIXIE |
| Reconstruction | Expressive hands and faces dataset (EHF) | TR V2V (mm), left hand | 25.6 | PIXIE |
| Reconstruction | Expressive hands and faces dataset (EHF) | TR V2V (mm), whole body | 67.6 | PIXIE |
| Reconstruction | Expressive hands and faces dataset (EHF) | mean P2S | 29.9 | PIXIE |
| Reconstruction | Expressive hands and faces dataset (EHF) | median P2S | 18.4 | PIXIE |
| Reconstruction | AGORA | B-MPJPE | 140.3 | PIXIE |
| Reconstruction | AGORA | B-MVE | 142.2 | PIXIE |
| Reconstruction | AGORA | B-NMJE | 171.1 | PIXIE |
| Reconstruction | AGORA | B-NMVE | 173.4 | PIXIE |
| Reconstruction | AGORA | F-MPJPE | 54.5 | PIXIE |
| Reconstruction | AGORA | F-MVE | 50.2 | PIXIE |
| Reconstruction | AGORA | FB-MPJPE | 189.3 | PIXIE |
| Reconstruction | AGORA | FB-MVE | 191.8 | PIXIE |
| Reconstruction | AGORA | FB-NMJE | 230.9 | PIXIE |
| Reconstruction | AGORA | FB-NMVE | 233.9 | PIXIE |
| Facial Recognition and Modelling | NoW Benchmark | Mean Reconstruction Error (mm) | 1.49 | PIXIE |
| Facial Recognition and Modelling | NoW Benchmark | Median Reconstruction Error (mm) | 1.18 | PIXIE |
| Facial Recognition and Modelling | NoW Benchmark | Stdev Reconstruction Error (mm) | 1.25 | PIXIE |
| 3D Human Pose Estimation | AGORA | B-MPJPE | 140.3 | PIXIE |
| 3D Human Pose Estimation | AGORA | B-MVE | 142.2 | PIXIE |
| 3D Human Pose Estimation | AGORA | B-NMJE | 171.1 | PIXIE |
| 3D Human Pose Estimation | AGORA | B-NMVE | 173.4 | PIXIE |
| 3D Human Pose Estimation | AGORA | F-MPJPE | 54.5 | PIXIE |
| 3D Human Pose Estimation | AGORA | F-MVE | 50.2 | PIXIE |
| 3D Human Pose Estimation | AGORA | FB-MPJPE | 189.3 | PIXIE |
| 3D Human Pose Estimation | AGORA | FB-MVE | 191.8 | PIXIE |
| 3D Human Pose Estimation | AGORA | FB-NMJE | 230.9 | PIXIE |
| 3D Human Pose Estimation | AGORA | FB-NMVE | 233.9 | PIXIE |
| Hand | FreiHAND | PA-F@15mm | 0.919 | PIXIE hand expert |
| Hand | FreiHAND | PA-F@5mm | 0.468 | PIXIE hand expert |
| Hand | FreiHAND | PA-MPJPE | 12 | PIXIE hand expert |
| Hand | FreiHAND | PA-MPVPE | 12.1 | PIXIE hand expert |
| Pose Estimation | AGORA | B-MPJPE | 140.3 | PIXIE |
| Pose Estimation | AGORA | B-MVE | 142.2 | PIXIE |
| Pose Estimation | AGORA | B-NMJE | 171.1 | PIXIE |
| Pose Estimation | AGORA | B-NMVE | 173.4 | PIXIE |
| Pose Estimation | AGORA | F-MPJPE | 54.5 | PIXIE |
| Pose Estimation | AGORA | F-MVE | 50.2 | PIXIE |
| Pose Estimation | AGORA | FB-MPJPE | 189.3 | PIXIE |
| Pose Estimation | AGORA | FB-MVE | 191.8 | PIXIE |
| Pose Estimation | AGORA | FB-NMJE | 230.9 | PIXIE |
| Pose Estimation | AGORA | FB-NMVE | 233.9 | PIXIE |
| Pose Estimation | FreiHAND | PA-F@15mm | 0.919 | PIXIE hand expert |
| Pose Estimation | FreiHAND | PA-F@5mm | 0.468 | PIXIE hand expert |
| Pose Estimation | FreiHAND | PA-MPJPE | 12 | PIXIE hand expert |
| Pose Estimation | FreiHAND | PA-MPVPE | 12.1 | PIXIE hand expert |
| Hand Pose Estimation | FreiHAND | PA-F@15mm | 0.919 | PIXIE hand expert |
| Hand Pose Estimation | FreiHAND | PA-F@5mm | 0.468 | PIXIE hand expert |
| Hand Pose Estimation | FreiHAND | PA-MPJPE | 12 | PIXIE hand expert |
| Hand Pose Estimation | FreiHAND | PA-MPVPE | 12.1 | PIXIE hand expert |
| Face Reconstruction | NoW Benchmark | Mean Reconstruction Error (mm) | 1.49 | PIXIE |
| Face Reconstruction | NoW Benchmark | Median Reconstruction Error (mm) | 1.18 | PIXIE |
| Face Reconstruction | NoW Benchmark | Stdev Reconstruction Error (mm) | 1.25 | PIXIE |
| 3D | AGORA | B-MPJPE | 140.3 | PIXIE |
| 3D | AGORA | B-MVE | 142.2 | PIXIE |
| 3D | AGORA | B-NMJE | 171.1 | PIXIE |
| 3D | AGORA | B-NMVE | 173.4 | PIXIE |
| 3D | AGORA | F-MPJPE | 54.5 | PIXIE |
| 3D | AGORA | F-MVE | 50.2 | PIXIE |
| 3D | AGORA | FB-MPJPE | 189.3 | PIXIE |
| 3D | AGORA | FB-MVE | 191.8 | PIXIE |
| 3D | AGORA | FB-NMJE | 230.9 | PIXIE |
| 3D | AGORA | FB-NMVE | 233.9 | PIXIE |
| 3D | FreiHAND | PA-F@15mm | 0.919 | PIXIE hand expert |
| 3D | FreiHAND | PA-F@5mm | 0.468 | PIXIE hand expert |
| 3D | FreiHAND | PA-MPJPE | 12 | PIXIE hand expert |
| 3D | FreiHAND | PA-MPVPE | 12.1 | PIXIE hand expert |
| 3D | NoW Benchmark | Mean Reconstruction Error (mm) | 1.49 | PIXIE |
| 3D | NoW Benchmark | Median Reconstruction Error (mm) | 1.18 | PIXIE |
| 3D | NoW Benchmark | Stdev Reconstruction Error (mm) | 1.25 | PIXIE |
| 3D Multi-Person Pose Estimation | AGORA | B-MPJPE | 140.3 | PIXIE |
| 3D Multi-Person Pose Estimation | AGORA | B-MVE | 142.2 | PIXIE |
| 3D Multi-Person Pose Estimation | AGORA | B-NMJE | 171.1 | PIXIE |
| 3D Multi-Person Pose Estimation | AGORA | B-NMVE | 173.4 | PIXIE |
| 3D Multi-Person Pose Estimation | AGORA | F-MPJPE | 54.5 | PIXIE |
| 3D Multi-Person Pose Estimation | AGORA | F-MVE | 50.2 | PIXIE |
| 3D Multi-Person Pose Estimation | AGORA | FB-MPJPE | 189.3 | PIXIE |
| 3D Multi-Person Pose Estimation | AGORA | FB-MVE | 191.8 | PIXIE |
| 3D Multi-Person Pose Estimation | AGORA | FB-NMJE | 230.9 | PIXIE |
| 3D Multi-Person Pose Estimation | AGORA | FB-NMVE | 233.9 | PIXIE |
| 3D Face Modelling | NoW Benchmark | Mean Reconstruction Error (mm) | 1.49 | PIXIE |
| 3D Face Modelling | NoW Benchmark | Median Reconstruction Error (mm) | 1.18 | PIXIE |
| 3D Face Modelling | NoW Benchmark | Stdev Reconstruction Error (mm) | 1.25 | PIXIE |
| 3D Face Reconstruction | NoW Benchmark | Mean Reconstruction Error (mm) | 1.49 | PIXIE |
| 3D Face Reconstruction | NoW Benchmark | Median Reconstruction Error (mm) | 1.18 | PIXIE |
| 3D Face Reconstruction | NoW Benchmark | Stdev Reconstruction Error (mm) | 1.25 | PIXIE |
| 3D Hand Pose Estimation | FreiHAND | PA-F@15mm | 0.919 | PIXIE hand expert |
| 3D Hand Pose Estimation | FreiHAND | PA-F@5mm | 0.468 | PIXIE hand expert |
| 3D Hand Pose Estimation | FreiHAND | PA-MPJPE | 12 | PIXIE hand expert |
| 3D Hand Pose Estimation | FreiHAND | PA-MPVPE | 12.1 | PIXIE hand expert |
| 1 Image, 2*2 Stitchi | AGORA | B-MPJPE | 140.3 | PIXIE |
| 1 Image, 2*2 Stitchi | AGORA | B-MVE | 142.2 | PIXIE |
| 1 Image, 2*2 Stitchi | AGORA | B-NMJE | 171.1 | PIXIE |
| 1 Image, 2*2 Stitchi | AGORA | B-NMVE | 173.4 | PIXIE |
| 1 Image, 2*2 Stitchi | AGORA | F-MPJPE | 54.5 | PIXIE |
| 1 Image, 2*2 Stitchi | AGORA | F-MVE | 50.2 | PIXIE |
| 1 Image, 2*2 Stitchi | AGORA | FB-MPJPE | 189.3 | PIXIE |
| 1 Image, 2*2 Stitchi | AGORA | FB-MVE | 191.8 | PIXIE |
| 1 Image, 2*2 Stitchi | AGORA | FB-NMJE | 230.9 | PIXIE |
| 1 Image, 2*2 Stitchi | AGORA | FB-NMVE | 233.9 | PIXIE |
| 1 Image, 2*2 Stitchi | FreiHAND | PA-F@15mm | 0.919 | PIXIE hand expert |
| 1 Image, 2*2 Stitchi | FreiHAND | PA-F@5mm | 0.468 | PIXIE hand expert |
| 1 Image, 2*2 Stitchi | FreiHAND | PA-MPJPE | 12 | PIXIE hand expert |
| 1 Image, 2*2 Stitchi | FreiHAND | PA-MPVPE | 12.1 | PIXIE hand expert |
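
Several metrics in the table carry a "PA" prefix, which conventionally denotes Procrustes alignment: the prediction is rigidly aligned to the ground truth (scale, rotation, translation) before the error is measured. The sketch below shows one common way to compute such an error with an SVD-based similarity alignment. The function names are illustrative, and the exact evaluation protocol of each benchmark (joint sets, vertex subsets, units) is not reproduced here.

```python
import numpy as np

def procrustes_align(pred, gt):
    """Similarity-align pred (N, 3) to gt (N, 3) in the least-squares sense."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    P, G = pred - mu_p, gt - mu_g
    # Cross-covariance between ground truth and prediction.
    K = G.T @ P
    U, S, Vt = np.linalg.svd(K)
    # Flip the last axis if needed so that R is a proper rotation.
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.diag([1.0, 1.0, d])
    R = U @ D @ Vt
    scale = (S * np.diag(D)).sum() / (P ** 2).sum()
    return scale * P @ R.T + mu_g

def pa_mean_error(pred, gt):
    """Mean per-point Euclidean distance after Procrustes alignment."""
    aligned = procrustes_align(pred, gt)
    return np.linalg.norm(aligned - gt, axis=1).mean()

# Toy check: a scaled, rotated, and shifted copy of the ground truth
# should have (numerically) zero error after alignment.
rng = np.random.default_rng(0)
gt = rng.normal(size=(14, 3))
theta = np.deg2rad(30.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
pred = 1.7 * gt @ Rz.T + np.array([0.5, -0.2, 1.0])
raw_error = np.linalg.norm(pred - gt, axis=1).mean()
pa_error = pa_mean_error(pred, gt)
```

The same routine applies to joints (as in PA-MPJPE) or mesh vertices (as in PA V2V); only the point set passed in changes.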