Human-VDM: Learning Single-Image 3D Human Gaussian Splatting from Video Diffusion Models

Zhibin Liu, Haoye Dong, Aviral Chharia, Hefeng Wu

2024-09-04
Task: Lifelike 3D Human Generation

Abstract

Generating lifelike 3D humans from a single RGB image remains a challenging task in computer vision, as it requires accurate modeling of geometry, high-quality texture, and plausible unseen parts. Existing methods typically use multi-view diffusion models for 3D generation, but the views they produce are often inconsistent with one another, which hinders high-quality 3D human generation. To address this, we propose Human-VDM, a novel method for generating 3D humans from a single RGB image using Video Diffusion Models. Human-VDM provides temporally consistent views for 3D human generation using Gaussian Splatting. It consists of three modules: a view-consistent human video diffusion module, a video augmentation module, and a Gaussian Splatting module. First, a single image is fed into the human video diffusion module to generate a coherent human video. Next, the video augmentation module applies super-resolution and video interpolation to enhance the textures and geometric smoothness of the generated video. Finally, the 3D Human Gaussian Splatting module learns lifelike humans under the guidance of these high-resolution and view-consistent images. Experiments demonstrate that Human-VDM generates high-quality 3D humans from a single image, outperforming state-of-the-art methods in both generation quality and quantity. Project page: https://human-vdm.github.io/Human-VDM/
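
The three-stage data flow described in the abstract can be sketched roughly as follows. This is only an illustration of how the stages chain together, not the authors' implementation: the names `run_pipeline`, `generate_video`, `augment_video`, `fit_gaussians`, and `interpolate_frames` are hypothetical, frames are modeled as flat lists of pixel values, and the linear blend is a naive stand-in for the learned video interpolation used in the paper.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # toy stand-in for an image: a flat list of pixel values


def interpolate_frames(frames: Sequence[Frame]) -> List[Frame]:
    """Insert the average of each neighbouring pair of frames between them.

    A naive linear blend, used here only as a placeholder for a real
    video interpolation model. n input frames give 2n - 1 output frames.
    """
    out: List[Frame] = []
    for current, following in zip(frames, frames[1:]):
        out.append(list(current))
        out.append([(a + b) / 2 for a, b in zip(current, following)])
    if frames:
        out.append(list(frames[-1]))
    return out


def run_pipeline(
    image: Frame,
    generate_video: Callable[[Frame], List[Frame]],
    augment_video: Callable[[List[Frame]], List[Frame]],
    fit_gaussians: Callable[[List[Frame]], object],
) -> object:
    """Chain the three stages: image -> video -> augmented video -> 3D model."""
    video = generate_video(image)      # stage 1: view-consistent video (hypothetical callable)
    enhanced = augment_video(video)    # stage 2: super-resolution / interpolation
    return fit_gaussians(enhanced)     # stage 3: Gaussian Splatting fit (hypothetical callable)
```

Passing the stages in as callables keeps the sketch independent of any particular diffusion or splatting library; each stage can be swapped for a real model without changing the orchestration.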

Results

Task                          Dataset            Metric           Value   Model
Lifelike 3D Human Generation  THuman2.0 Dataset  CLIP Similarity  0.9235  Human-VDM
Lifelike 3D Human Generation  THuman2.0 Dataset  LPIPS            0.0957  Human-VDM
Lifelike 3D Human Generation  THuman2.0 Dataset  PSNR             20.068  Human-VDM
Lifelike 3D Human Generation  THuman2.0 Dataset  SSIM             0.9228  Human-VDM
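
For readers unfamiliar with the PSNR row, the metric is conventionally defined as 10 · log10(MAX² / MSE), in decibels. A minimal sketch follows, assuming images are flat sequences of intensities normalized to [0, 1]; the function name `psnr` and this input layout are illustrative choices, and the evaluation protocol actually used for the table (resolution, masking, averaging across views) is not stated on this page.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities in [0, max_value].
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Under this definition and the [0, 1] normalization assumed here, a PSNR near 20 dB corresponds to a mean squared error near 0.01, that is, a root-mean-square pixel error of about 0.1.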

Related Papers

Ultraman: Single Image 3D Human Reconstruction with Ultra Speed and Detail (2024-03-18)
SIFU: Side-view Conditioned Implicit Function for Real-world Usable Clothed Human Reconstruction (2023-12-10)
SiTH: Single-view Textured Human Reconstruction with Image-Conditioned Diffusion (2023-11-27)
PaMIR: Parametric Model-Conditioned Implicit Representation for Image-based Human Reconstruction (2020-07-08)
PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization (2019-05-13)