One2Any: One-Reference 6D Pose Estimation for Any Object

Mengya Liu, Siyuan Li, Ajad Chhatkuli, Prune Truong, Luc van Gool, Federico Tombari

2025-05-07CVPR 2025 1Pose Estimation 6D Pose Estimation using RGB 6D Pose Estimation

Abstract

6D object pose estimation remains challenging for many applications due to dependencies on complete 3D models, multi-view images, or training limited to specific object categories. These requirements make generalization to novel objects difficult for which neither 3D models nor multi-view images may be available. To address this, we propose a novel method One2Any that estimates the relative 6-degrees of freedom (DOF) object pose using only a single reference-single query RGB-D image, without prior knowledge of its 3D model, multi-view data, or category constraints. We treat object pose estimation as an encoding-decoding process, first, we obtain a comprehensive Reference Object Pose Embedding (ROPE) that encodes an object shape, orientation, and texture from a single reference view. Using this embedding, a U-Net-based pose decoding module produces Reference Object Coordinate (ROC) for new views, enabling fast and accurate pose estimation. This simple encoding-decoding framework allows our model to be trained on any pair-wise pose data, enabling large-scale training and demonstrating great scalability. Experiments on multiple benchmark datasets demonstrate that our model generalizes well to novel objects, achieving state-of-the-art accuracy and robustness even rivaling methods that require multi-view or CAD inputs, at a fraction of compute.

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning2025-07-17 Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark2025-07-17 DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model2025-07-17 From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation2025-07-17 AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability2025-07-17 SpatialTrackerV2: 3D Point Tracking Made Easy2025-07-16 SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation2025-07-16 Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation2025-07-16