Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization

Shunsuke Saito, Zeng Huang, Ryota Natsume, Shigeo Morishima, Angjoo Kanazawa, Hao Li

2019-05-13 | ICCV 2019 10

Tasks: 3D Human Pose Estimation, 3D Shape Reconstruction, Vocal Bursts Intensity Prediction, 3D Object Reconstruction, Lifelike 3D Human Generation, 3D Human Reconstruction, 3D Shape Reconstruction From A Single 2D Image, 3D Object Reconstruction From A Single Image

Abstract

We introduce Pixel-aligned Implicit Function (PIFu), a highly effective implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object. Using PIFu, we propose an end-to-end deep learning method for digitizing highly detailed clothed humans that can infer both 3D surface and texture from a single image, and optionally, multiple input images. Highly intricate shapes, such as hairstyles and clothing, as well as their variations and deformations, can be digitized in a unified way. Compared to existing representations used for 3D deep learning, PIFu can produce high-resolution surfaces including largely unseen regions such as the back of a person. In particular, it is memory efficient unlike the voxel representation, can handle arbitrary topology, and the resulting surface is spatially aligned with the input image. Furthermore, while previous techniques are designed to process either a single image or multiple views, PIFu extends naturally to an arbitrary number of views. We demonstrate high-resolution and robust reconstructions on real-world images from the DeepFashion dataset, which contains a variety of challenging clothing types. Our method achieves state-of-the-art performance on a public benchmark and outperforms the prior work for clothed human digitization from a single image.
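The "pixel-aligned" idea in the abstract can be illustrated with a small sketch: a 3D query point is projected into the image, an image feature is sampled at that 2D location, and an implicit function is evaluated on the sampled feature together with the point's depth. The code below is a toy illustration and not the authors' implementation. The orthographic projection in pixel units, the nested-list feature map, and the names `bilinear_sample`, `pixel_aligned_query`, and `toy_implicit_fn` are all assumptions made for this example; in a real system the feature map would come from a learned image encoder and the implicit function would be a trained network.

```python
import math


def bilinear_sample(feature_map, x, y):
    """Bilinearly sample an H x W x C nested-list feature map at pixel (x, y)."""
    h, w = len(feature_map), len(feature_map[0])
    # Clamp to the valid pixel range so queries on the border stay defined.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    tx, ty = x - x0, y - y0
    sampled = []
    for c in range(len(feature_map[0][0])):
        top = (1 - tx) * feature_map[y0][x0][c] + tx * feature_map[y0][x1][c]
        bottom = (1 - tx) * feature_map[y1][x0][c] + tx * feature_map[y1][x1][c]
        sampled.append((1 - ty) * top + ty * bottom)
    return sampled


def pixel_aligned_query(feature_map, point, implicit_fn):
    """Evaluate implicit_fn on the image feature aligned with a 3D point."""
    x, y, z = point
    # Assumption: orthographic camera, so (x, y) are already pixel coordinates
    # and z is the depth along the viewing direction.
    feature = bilinear_sample(feature_map, x, y)
    return implicit_fn(feature, z)


def toy_implicit_fn(feature, z):
    """Hypothetical stand-in for a learned network: returns a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(feature[0] - abs(z))))


# A 2 x 2 feature map with one channel per pixel.
features = [[[0.0], [2.0]], [[4.0], [6.0]]]
occupancy = pixel_aligned_query(features, (0.5, 0.5, 3.0), toy_implicit_fn)
print(occupancy)
```

Because the function is evaluated per query point rather than stored on a grid, memory use does not grow with output resolution the way a dense voxel grid does, which matches the memory-efficiency claim in the abstract.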

Results

Task | Dataset | Metric | Value | Model
Reconstruction | CustomHumans | Chamfer Distance P-to-S | 2.209 | PIFu
Reconstruction | CustomHumans | Chamfer Distance S-to-P | 2.582 | PIFu
Reconstruction | CustomHumans | Normal Consistency | 0.805 | PIFu
Reconstruction | CustomHumans | f-Score | 34.881 | PIFu
Reconstruction | CAPE | Chamfer (cm) | 3.573 | PIFu (THuman2.0)
Reconstruction | CAPE | NC | 0.186 | PIFu (THuman2.0)
Reconstruction | CAPE | P2S (cm) | 1.483 | PIFu (THuman2.0)
Reconstruction | 4D-DRESS | Chamfer (cm) | 2.696 | PIFu_Inner
Reconstruction | 4D-DRESS | IoU | 0.69 | PIFu_Inner
Reconstruction | 4D-DRESS | Normal Consistency | 0.792 | PIFu_Inner
Reconstruction | 4D-DRESS | Chamfer (cm) | 2.783 | PIFu_Outer
Reconstruction | 4D-DRESS | IoU | 0.697 | PIFu_Outer
Reconstruction | 4D-DRESS | Normal Consistency | 0.759 | PIFu_Outer
Object Reconstruction | RenderPeople | Chamfer (cm) | 0.567 | PIFu (3 views)
Object Reconstruction | RenderPeople | Point-to-surface distance (cm) | 0.554 | PIFu (3 views)
Object Reconstruction | RenderPeople | Surface normal consistency | 0.094 | PIFu (3 views)
Object Reconstruction | RenderPeople | Chamfer (cm) | 1.5 | PIFu
Object Reconstruction | RenderPeople | Point-to-surface distance (cm) | 1.52 | PIFu
Object Reconstruction | RenderPeople | Surface normal consistency | 0.084 | PIFu
Object Reconstruction | BUFF | Chamfer (cm) | 1.14 | PIFu
Object Reconstruction | BUFF | Point-to-surface distance (cm) | 1.15 | PIFu
Object Reconstruction | BUFF | Surface normal consistency | 0.0928 | PIFu
3D Object Reconstruction | RenderPeople | Chamfer (cm) | 0.567 | PIFu (3 views)
3D Object Reconstruction | RenderPeople | Point-to-surface distance (cm) | 0.554 | PIFu (3 views)
3D Object Reconstruction | RenderPeople | Surface normal consistency | 0.094 | PIFu (3 views)
3D Object Reconstruction | RenderPeople | Chamfer (cm) | 1.5 | PIFu
3D Object Reconstruction | RenderPeople | Point-to-surface distance (cm) | 1.52 | PIFu
3D Object Reconstruction | RenderPeople | Surface normal consistency | 0.084 | PIFu
3D Object Reconstruction | BUFF | Chamfer (cm) | 1.14 | PIFu
3D Object Reconstruction | BUFF | Point-to-surface distance (cm) | 1.15 | PIFu
3D Object Reconstruction | BUFF | Surface normal consistency | 0.0928 | PIFu
Lifelike 3D Human Generation | THuman2.0 Dataset | CLIP Similarity | 0.8501 | PIFu
Lifelike 3D Human Generation | THuman2.0 Dataset | LPIPS | 0.1615 | PIFu
Lifelike 3D Human Generation | THuman2.0 Dataset | PSNR | 15.0248 | PIFu
Lifelike 3D Human Generation | THuman2.0 Dataset | SSIM | 0.8884 | PIFu
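For readers unfamiliar with the distance metrics reported above, the following sketch shows one common way to compute a Chamfer-style distance between two sampled point sets. It is an illustration under stated assumptions, not the evaluation code behind any number in the table: each benchmark may differ in sampling density, in whether distances are squared, in how the two directions are averaged, and in whether the reference is a true mesh surface (as in point-to-surface distance) rather than a point sample. The function names `directed_mean_distance` and `chamfer_distance` are hypothetical.

```python
import math


def directed_mean_distance(source, target):
    """Mean distance from each source point to its nearest target point."""
    total = 0.0
    for p in source:
        total += min(math.dist(p, q) for q in target)
    return total / len(source)


def chamfer_distance(a, b):
    """Symmetric distance: average of the two directed mean distances.

    Assumption: unsquared Euclidean distances and an equal-weight average;
    other conventions (sums, squared distances) are also in use.
    """
    return 0.5 * (directed_mean_distance(a, b) + directed_mean_distance(b, a))


# Two tiny point sets standing in for samples of a predicted and a reference surface.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]

print(directed_mean_distance(predicted, reference))
print(chamfer_distance(predicted, reference))
```

The directed distance from predicted samples to the reference is a point-set approximation of a point-to-surface measure; the brute-force nearest-neighbor search here is quadratic in the number of points, so a spatial index would be needed for realistically sized scans.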

Related Papers

Systematic Comparison of Projection Methods for Monocular 3D Human Pose Estimation on Fisheye Images (2025-06-24)
ExtPose: Robust and Coherent Pose Estimation by Extending ViTs (2025-06-18)
PoseGRAF: Geometric-Reinforced Adaptive Fusion for Monocular 3D Human Pose Estimation (2025-06-17)
PF-LHM: 3D Animatable Avatar Reconstruction from Pose-free Articulated Human Images (2025-06-16)
SMPL Normal Map Is All You Need for Single-view Textured Human Reconstruction (2025-06-15)
HuSc3D: Human Sculpture dataset for 3D object reconstruction (2025-06-09)
Object-X: Learning to Reconstruct Multi-Modal 3D Object Representations (2025-06-05)
Learning Pyramid-structured Long-range Dependencies for 3D Human Pose Estimation (2025-06-03)