Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

3D-LFM: Lifting Foundation Model

Mosam Dabhi, Laszlo A. Jeni, Simon Lucey

2023-12-19 | CVPR 2024
Tasks: 3D Human Pose Estimation, 3D Hand Pose Estimation, Monocular 3D Human Pose Estimation, Pose Estimation, 3D Facial Landmark Localization

Abstract

The lifting of 3D structure and camera from 2D landmarks is a cornerstone of the entire discipline of computer vision. Traditional methods have been confined to specific rigid objects, such as those in Perspective-n-Point (PnP) problems, but deep learning has expanded our capability to reconstruct a wide range of object classes (e.g. C3DPO and PAUL) with resilience to noise, occlusions, and perspective distortions. All these techniques, however, have been limited by the fundamental need to establish correspondences across the 3D training data, which significantly limits their utility to applications where one has an abundance of "in-correspondence" 3D data. Our approach harnesses the inherent permutation equivariance of transformers to manage a varying number of points per 3D data instance, withstands occlusions, and generalizes to unseen categories. We demonstrate state-of-the-art performance across 2D-3D lifting task benchmarks. Since our approach can be trained across such a broad class of structures, we refer to it simply as a 3D Lifting Foundation Model (3D-LFM), the first of its kind.
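
The permutation equivariance mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: it is a single self-attention layer in NumPy with random weights and no positional encoding, and the names `self_attention`, `w_q`, `w_k`, and `w_v` are hypothetical. It shows only that reordering the input landmarks reorders the outputs in the same way, and that the same weights accept any number of points.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention with no positional encoding.

    x has shape (n_points, d_in); the output has shape (n_points, d_model).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Subtract the row-wise max for a numerically stable softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
d_in, d_model = 2, 8  # 2D landmarks in, arbitrary feature width out (assumed)
w_q, w_k, w_v = (rng.normal(size=(d_in, d_model)) for _ in range(3))

landmarks_2d = rng.normal(size=(5, d_in))  # toy data: 5 random 2D points
perm = rng.permutation(5)

out = self_attention(landmarks_2d, w_q, w_k, w_v)
out_perm = self_attention(landmarks_2d[perm], w_q, w_k, w_v)

# Equivariance: permuting the inputs permutes the outputs identically.
equivariant = np.allclose(out[perm], out_perm)

# The same weights handle a different number of points without changes.
more_points = self_attention(rng.normal(size=(9, d_in)), w_q, w_k, w_v)
```

Because no step depends on the position of a point in the input array, a layer of this kind needs no fixed landmark ordering, which is the property the abstract relies on to avoid per-category correspondences.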

Results

| Task | Dataset | Metric | Value | Model |
| --- | --- | --- | --- | --- |
| Facial Recognition and Modelling | H3WB | Average MPJPE (mm) | 10.44 | 3D-LFM |
| 3D Human Pose Estimation | H3WB | MPJPE | 60.83 | 3D-LFM |
| Hand | H3WB | Average MPJPE (mm) | 28.22 | 3D-LFM |
| Pose Estimation | H3WB | MPJPE | 60.83 | 3D-LFM |
| Pose Estimation | H3WB | Average MPJPE (mm) | 28.22 | 3D-LFM |
| Hand Pose Estimation | H3WB | Average MPJPE (mm) | 28.22 | 3D-LFM |
| Facial Landmark Detection | H3WB | Average MPJPE (mm) | 10.44 | 3D-LFM |
| Face Reconstruction | H3WB | Average MPJPE (mm) | 10.44 | 3D-LFM |
| 3D | H3WB | MPJPE | 60.83 | 3D-LFM |
| 3D | H3WB | Average MPJPE (mm) | 28.22 | 3D-LFM |
| 3D | H3WB | Average MPJPE (mm) | 10.44 | 3D-LFM |
| 3D Face Modelling | H3WB | Average MPJPE (mm) | 10.44 | 3D-LFM |
| 3D Face Reconstruction | H3WB | Average MPJPE (mm) | 10.44 | 3D-LFM |
| 3D Hand Pose Estimation | H3WB | Average MPJPE (mm) | 28.22 | 3D-LFM |
| 1 Image, 2*2 Stitchi | H3WB | MPJPE | 60.83 | 3D-LFM |
| 1 Image, 2*2 Stitchi | H3WB | Average MPJPE (mm) | 28.22 | 3D-LFM |
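
The results above are reported as MPJPE (mean per-joint position error). As a rough sketch of how this metric is commonly computed, the function below averages the Euclidean distance between predicted and ground-truth 3D joints. The name `mpjpe` and the toy data are illustrative assumptions; the page does not describe the H3WB evaluation protocol, which may apply alignment or per-part averaging that is not shown here.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Euclidean distance between matching 3D joints.

    pred and gt have shape (..., n_joints, 3) and share the same units
    (e.g. millimetres). No alignment step is applied in this sketch.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    return np.linalg.norm(pred - gt, axis=-1).mean()

# Toy example: three joints with per-joint errors of 5, 0, and 1.
gt = np.zeros((3, 3))
pred = np.array([[3.0, 4.0, 0.0],
                 [0.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0]])
error = mpjpe(pred, gt)  # (5 + 0 + 1) / 3 = 2.0
```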

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)