Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Occlusion-Robust Object Pose Estimation with Holistic Representation

Bo Chen, Tat-Jun Chin, Marius Klimavicius

2021-10-22 · Representation Learning · Pose Estimation · 6D Pose Estimation using RGB

Paper · PDF · Code (official)

Abstract

Practical object pose estimation demands robustness against occlusions of the target object. State-of-the-art (SOTA) object pose estimators take a two-stage approach, where the first stage predicts 2D landmarks using a deep network and the second stage solves for the 6DOF pose from 2D-3D correspondences. Although widely adopted, such two-stage approaches can suffer from poor generalisation to novel occlusions and weak landmark coherence due to disrupted features. To address these issues, we develop a novel occlude-and-blackout batch augmentation technique to learn occlusion-robust deep features, and a multi-precision supervision architecture to encourage holistic pose representation learning for accurate and coherent landmark predictions. We perform careful ablation tests to verify the impact of our innovations and compare our method to SOTA pose estimators. Without the need for any post-processing or refinement, our method exhibits superior performance on the LINEMOD dataset. On the YCB-Video dataset our method outperforms all non-refinement methods in terms of the ADD(-S) metric. We also demonstrate the high data-efficiency of our method. Our code is available at http://github.com/BoChenYS/ROPE
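The abstract's occlude-and-blackout batch augmentation trains the landmark network on images where part of the object is hidden, forcing it to rely on holistic context. A minimal, framework-free sketch of this style of augmentation, assuming a 2D image stored as nested lists and a random rectangular blackout region (the helper name `occlude_and_blackout` and its parameters are illustrative, not the paper's implementation):

```python
import random

def occlude_and_blackout(image, max_frac=0.5, rng=None):
    """Return a copy of a 2D image with a random rectangle zeroed out.

    A toy stand-in for occlusion-style augmentation: the network never
    sees the blacked-out pixels, so it must infer landmarks from the
    surrounding context.
    """
    rng = rng or random.Random(0)
    h, w = len(image), len(image[0])
    # Pick the occluder size, capped at max_frac of each dimension.
    rh = rng.randint(1, max(1, int(h * max_frac)))
    rw = rng.randint(1, max(1, int(w * max_frac)))
    # Pick where the occluder's top-left corner lands.
    top = rng.randint(0, h - rh)
    left = rng.randint(0, w - rw)
    # Copy row-by-row so the input image is left untouched.
    out = [row[:] for row in image]
    for r in range(top, top + rh):
        for c in range(left, left + rw):
            out[r][c] = 0
    return out
```

In practice such an augmentation would be applied per sample inside the training batch; torchvision's `RandomErasing` transform implements a comparable idea for tensor images.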

Results

Task            | Dataset           | Metric   | Value | Model
Pose Estimation | YCB-Video         | Mean ADD | 66.59 | ROPE
Pose Estimation | YCB-Video         | Mean AUC | 79.88 | ROPE
Pose Estimation | LineMOD           | Mean ADD | 95.61 | ROPE
Pose Estimation | Occlusion LineMOD | Mean ADD | 45.95 | ROPE

Related Papers

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper (2025-07-20)
Spectral Bellman Method: Unifying Representation and Exploration in RL (2025-07-17)
Boosting Team Modeling through Tempo-Relational Representation Learning (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)