Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

GenPose: Generative Category-level Object Pose Estimation via Diffusion Models

Jiyao Zhang, Mingdong Wu, Hao Dong

2023-06-18
Tasks: Pose Estimation, Pose Tracking, 6D Pose Estimation using RGBD, 6D Pose Estimation

Abstract

Object pose estimation plays a vital role in embodied AI and computer vision, enabling intelligent agents to comprehend and interact with their surroundings. Despite the practicality of category-level pose estimation, current approaches encounter challenges with partially observed point clouds, known as the multi-hypothesis issue. In this study, we propose a novel solution by reframing category-level object pose estimation as conditional generative modeling, departing from traditional point-to-point regression. Leveraging score-based diffusion models, we estimate object poses by sampling candidates from the diffusion model and aggregating them through a two-step process: filtering out outliers via likelihood estimation and subsequently mean-pooling the remaining candidates. To avoid the costly integration process when estimating the likelihood, we introduce an alternative method that trains an energy-based model from the original score-based model, enabling end-to-end likelihood estimation. Our approach achieves state-of-the-art performance on the REAL275 dataset, surpassing 50% and 60% on the strict 5°2cm and 5°5cm metrics, respectively. Furthermore, our method demonstrates strong generalizability to novel categories sharing similar symmetric properties without fine-tuning and can readily adapt to object pose tracking tasks, yielding comparable results to the current state-of-the-art baselines.
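
The two-step aggregation described in the abstract (filter candidates by likelihood, then mean-pool the rest) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `aggregate_poses`, the `keep_ratio` parameter, the quaternion layout, and the use of eigenvector-based quaternion averaging for the rotation part are all assumptions made for illustration, and `scores` simply stands in for whatever the energy-based model would output.

```python
import numpy as np

def aggregate_poses(quats, trans, scores, keep_ratio=0.6):
    """Filter pose candidates by score, then average the survivors.

    quats:  (N, 4) unit quaternions, one per candidate (hypothetical layout)
    trans:  (N, 3) translations
    scores: (N,) higher means more likely (stand-in for a likelihood estimate)
    """
    quats = np.asarray(quats, dtype=float)
    trans = np.asarray(trans, dtype=float)
    scores = np.asarray(scores, dtype=float)

    # Step 1: keep only the highest-scoring fraction of candidates.
    n_keep = max(1, int(round(keep_ratio * len(scores))))
    keep = np.argsort(scores)[::-1][:n_keep]

    # Step 2a: translations can be averaged directly.
    t_mean = trans[keep].mean(axis=0)

    # Step 2b: average rotations as the principal eigenvector of sum(q q^T).
    # This choice is an assumption; it is insensitive to the q / -q ambiguity.
    q = quats[keep]
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    _, eigvecs = np.linalg.eigh(q.T @ q)  # eigenvalues come back ascending
    q_mean = eigvecs[:, -1]
    return q_mean, t_mean
```

The sign of the returned quaternion is arbitrary, since `q` and `-q` describe the same rotation.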

Results

Task                  Dataset   Metric         Value   Model
Pose Estimation       REAL275   mAP 10°, 2cm   72.4    GenPose
Pose Estimation       REAL275   mAP 10°, 5cm   84      GenPose
Pose Estimation       REAL275   mAP 5°, 2cm    52.1    GenPose
Pose Estimation       REAL275   mAP 5°, 5cm    60.9    GenPose
3D                    REAL275   mAP 10°, 2cm   72.4    GenPose
3D                    REAL275   mAP 10°, 5cm   84      GenPose
3D                    REAL275   mAP 5°, 2cm    52.1    GenPose
3D                    REAL275   mAP 5°, 5cm    60.9    GenPose
1 Image, 2*2 Stitchi  REAL275   mAP 10°, 2cm   72.4    GenPose
1 Image, 2*2 Stitchi  REAL275   mAP 10°, 5cm   84      GenPose
1 Image, 2*2 Stitchi  REAL275   mAP 5°, 2cm    52.1    GenPose
1 Image, 2*2 Stitchi  REAL275   mAP 5°, 5cm    60.9    GenPose

GenPose code: https://github.com/Jiyao06/GenPose
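
As a rough illustration of what thresholds such as 5°, 2cm mean, the sketch below checks a single prediction against ground truth. It assumes the usual reading of these metrics (a pose counts as correct when the rotation error is below the angle threshold and the translation error is below the distance threshold) and assumes translations are given in metres. The function names are hypothetical, and the benchmark's actual mAP computation involves more than this per-instance test, for example handling of symmetric objects and averaging across categories.

```python
import numpy as np

def pose_errors(R_pred, t_pred, R_gt, t_gt):
    """Return (rotation error in degrees, translation error in cm).

    R_*: (3, 3) rotation matrices; t_*: (3,) translations in metres (assumed).
    """
    # Geodesic angle between the two rotations, from the trace of R_pred R_gt^T.
    R_rel = np.asarray(R_pred) @ np.asarray(R_gt).T
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    rot_err_deg = float(np.degrees(np.arccos(cos_angle)))
    trans_err_cm = 100.0 * float(np.linalg.norm(np.asarray(t_pred) - np.asarray(t_gt)))
    return rot_err_deg, trans_err_cm

def within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg, max_cm):
    """True if both errors fall under the given thresholds."""
    rot_err, trans_err = pose_errors(R_pred, t_pred, R_gt, t_gt)
    return rot_err < max_deg and trans_err < max_cm
```

Under these assumptions, `within_threshold(R_pred, t_pred, R_gt, t_gt, 5.0, 2.0)` is the per-instance test behind the 5°, 2cm rows.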

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)