Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting

Yan Di, Ruida Zhang, Zhiqiang Lou, Fabian Manhardt, Xiangyang Ji, Nassir Navab, Federico Tombari

Published: 2022-03-15 (CVPR 2022)
Tasks: Pose Estimation, 6D Pose Estimation using RGB, Retrieval, 6D Pose Estimation using RGBD, 6D Pose Estimation

Abstract

While 6D object pose estimation has recently made a huge leap forward, most methods can still only handle a single or a handful of different objects, which limits their applications. To circumvent this problem, category-level object pose estimation has recently been revamped, which aims at predicting the 6D pose as well as the 3D metric size for previously unseen instances from a given set of object classes. This is, however, a much more challenging task due to severe intra-class shape variations. To address this issue, we propose GPV-Pose, a novel framework for robust category-level pose estimation, harnessing geometric insights to enhance the learning of category-level pose-sensitive features. First, we introduce a decoupled confidence-driven rotation representation, which allows geometry-aware recovery of the associated rotation matrix. Second, we propose a novel geometry-guided point-wise voting paradigm for robust retrieval of the 3D object bounding box. Finally, leveraging these different output streams, we can enforce several geometric consistency terms, further increasing performance, especially for non-symmetric categories. GPV-Pose produces superior results to state-of-the-art competitors on common public benchmarks, whilst almost achieving real-time inference speed at 20 FPS.

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Pose Estimation | LineMOD | Mean ADD-S | 98.2 | GPV-Pose |
| Pose Estimation | REAL275 | FPS | 20 | GPV-Pose |
| Pose Estimation | REAL275 | mAP 10°, 10 cm | 74.6 | GPV-Pose |
| Pose Estimation | REAL275 | mAP 10°, 5 cm | 73.3 | GPV-Pose |
| Pose Estimation | REAL275 | mAP 3D IoU@25 | 84.2 | GPV-Pose |
| Pose Estimation | REAL275 | mAP 3D IoU@50 | 83 | GPV-Pose |
| Pose Estimation | REAL275 | mAP 3D IoU@75 | 64.4 | GPV-Pose |
| Pose Estimation | REAL275 | mAP 5°, 2 cm | 32 | GPV-Pose |
| Pose Estimation | REAL275 | mAP 5°, 5 cm | 42.9 | GPV-Pose |
| 3D | LineMOD | Mean ADD-S | 98.2 | GPV-Pose |
| 3D | REAL275 | FPS | 20 | GPV-Pose |
| 3D | REAL275 | mAP 10°, 10 cm | 74.6 | GPV-Pose |
| 3D | REAL275 | mAP 10°, 5 cm | 73.3 | GPV-Pose |
| 3D | REAL275 | mAP 3D IoU@25 | 84.2 | GPV-Pose |
| 3D | REAL275 | mAP 3D IoU@50 | 83 | GPV-Pose |
| 3D | REAL275 | mAP 3D IoU@75 | 64.4 | GPV-Pose |
| 3D | REAL275 | mAP 5°, 2 cm | 32 | GPV-Pose |
| 3D | REAL275 | mAP 5°, 5 cm | 42.9 | GPV-Pose |
| 6D Pose Estimation | LineMOD | Mean ADD-S | 98.2 | GPV-Pose |
| 1 Image, 2*2 Stitchi | LineMOD | Mean ADD-S | 98.2 | GPV-Pose |
| 1 Image, 2*2 Stitchi | REAL275 | FPS | 20 | GPV-Pose |
| 1 Image, 2*2 Stitchi | REAL275 | mAP 10°, 10 cm | 74.6 | GPV-Pose |
| 1 Image, 2*2 Stitchi | REAL275 | mAP 10°, 5 cm | 73.3 | GPV-Pose |
| 1 Image, 2*2 Stitchi | REAL275 | mAP 3D IoU@25 | 84.2 | GPV-Pose |
| 1 Image, 2*2 Stitchi | REAL275 | mAP 3D IoU@50 | 83 | GPV-Pose |
| 1 Image, 2*2 Stitchi | REAL275 | mAP 3D IoU@75 | 64.4 | GPV-Pose |
| 1 Image, 2*2 Stitchi | REAL275 | mAP 5°, 2 cm | 32 | GPV-Pose |
| 1 Image, 2*2 Stitchi | REAL275 | mAP 5°, 5 cm | 42.9 | GPV-Pose |
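The degree/centimetre metrics in the results count a prediction as correct when both its rotation error and its translation error fall under the stated thresholds. The sketch below shows that per-instance check only, under a few assumptions: poses are given as 3x3 rotation matrices plus translations in metres, the helper names are hypothetical, and the symmetry handling and per-category averaging that a full benchmark evaluation applies are left out.

```python
import math


def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices (nested lists)."""
    # trace(R_pred^T @ R_gt) equals the sum of element-wise products.
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_error_cm(t_pred, t_gt):
    """Euclidean distance in centimetres; inputs are assumed to be in metres."""
    return 100.0 * math.dist(t_pred, t_gt)


def within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg, max_cm):
    """True if the pose passes an 'n degrees, m cm' check."""
    return (rotation_error_deg(R_pred, R_gt) <= max_deg
            and translation_error_cm(t_pred, t_gt) <= max_cm)


def rot_z(deg):
    """Rotation about the z axis, used here only to build a toy example."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Toy prediction: 4 degrees and 3 cm away from the ground truth.
R_gt, t_gt = rot_z(0.0), [0.0, 0.0, 0.0]
R_pred, t_pred = rot_z(4.0), [0.0, 0.0, 0.03]

passes_5deg_5cm = within_threshold(R_pred, t_pred, R_gt, t_gt, 5.0, 5.0)
passes_5deg_2cm = within_threshold(R_pred, t_pred, R_gt, t_gt, 5.0, 2.0)
```

The toy pose passes the 5°, 5 cm check but fails the 5°, 2 cm check, which illustrates why the stricter metric reports a lower score in the table.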

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
HapticCap: A Multimodal Dataset and Task for Understanding User Experience of Vibration Haptic Signals (2025-07-17)
A Survey of Context Engineering for Large Language Models (2025-07-17)