DeepIM: Deep Iterative Matching for 6D Pose Estimation

Yi Li, Gu Wang, Xiangyang Ji, Yu Xiang, Dieter Fox

Date: 2018-03-31
Venue: ECCV 2018
Tasks: Pose Estimation, 6D Pose Estimation using RGB, Robot Manipulation, 6D Pose Estimation

Abstract

Estimating the 6D pose of objects from images is an important problem in various applications such as robot manipulation and virtual reality. While direct regression of images to object poses has limited accuracy, matching rendered images of an object against the observed image can produce accurate results. In this work, we propose a novel deep neural network for 6D pose matching named DeepIM. Given an initial pose estimation, our network is able to iteratively refine the pose by matching the rendered image against the observed image. The network is trained to predict a relative pose transformation using an untangled representation of 3D location and 3D orientation and an iterative training process. Experiments on two commonly used benchmarks for 6D pose estimation demonstrate that DeepIM achieves large improvements over state-of-the-art methods. We furthermore show that DeepIM is able to match previously unseen objects.
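The iterative refinement described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the pose is reduced to a single rotation angle plus a 3D translation, and `toy_predict_delta` is a hypothetical stand-in for the network. It recovers a fixed fraction of the true residual rather than comparing a rendered image with the observed one. Only the loop structure (predict a relative pose update, apply it, repeat) reflects the idea in the abstract.

```python
def toy_predict_delta(current, target, gain=0.5):
    """Hypothetical stand-in for the matching network.

    Returns a relative update (d_angle, d_trans) that covers a fixed
    fraction `gain` of the remaining error. Rotation and translation
    are predicted as separate components.
    """
    angle, trans = current
    target_angle, target_trans = target
    d_angle = gain * (target_angle - angle)
    d_trans = tuple(gain * (t - c) for t, c in zip(target_trans, trans))
    return d_angle, d_trans


def apply_delta(pose, delta):
    """Apply a relative update to a pose given as (angle, (x, y, z))."""
    angle, trans = pose
    d_angle, d_trans = delta
    return angle + d_angle, tuple(c + d for c, d in zip(trans, d_trans))


def refine(initial_pose, predict, num_iters=4):
    """Iteratively refine a pose; returns every intermediate estimate."""
    pose = initial_pose
    history = [pose]
    for _ in range(num_iters):
        pose = apply_delta(pose, predict(pose))
        history.append(pose)
    return history


# Toy usage: a coarse initial estimate is pulled toward the target pose.
target_pose = (0.0, (0.0, 0.0, 0.8))
initial_pose = (0.4, (0.1, -0.2, 1.0))
history = refine(initial_pose, lambda p: toy_predict_delta(p, target_pose))
```

In this toy setting the remaining error halves at every iteration. A learned predictor gives no such guarantee, which is one reason the number of refinement steps is left as a parameter.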

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Pose Estimation | LineMOD | Accuracy | 97.5 | PoseCNN + DeepIM |
| Pose Estimation | LineMOD | Mean ADD | 88.6 | PoseCNN + DeepIM |
| Pose Estimation | Occlusion LineMOD | Mean ADD | 55.5 | DeepIM (Train on Occlusion LineMOD) |
| Pose Estimation | YCB-Video | Mean ADD | 80.6 | PoseCNN + DeepIM |
| Pose Estimation | YCB-Video | Mean ADI | 92.4 | PoseCNN + DeepIM |
| 3D | LineMOD | Accuracy | 97.5 | PoseCNN + DeepIM |
| 3D | LineMOD | Mean ADD | 88.6 | PoseCNN + DeepIM |
| 3D | Occlusion LineMOD | Mean ADD | 55.5 | DeepIM (Train on Occlusion LineMOD) |
| 3D | YCB-Video | Mean ADD | 80.6 | PoseCNN + DeepIM |
| 3D | YCB-Video | Mean ADI | 92.4 | PoseCNN + DeepIM |
| 1 Image, 2*2 Stitchi | LineMOD | Accuracy | 97.5 | PoseCNN + DeepIM |
| 1 Image, 2*2 Stitchi | LineMOD | Mean ADD | 88.6 | PoseCNN + DeepIM |
| 1 Image, 2*2 Stitchi | Occlusion LineMOD | Mean ADD | 55.5 | DeepIM (Train on Occlusion LineMOD) |
| 1 Image, 2*2 Stitchi | YCB-Video | Mean ADD | 80.6 | PoseCNN + DeepIM |
| 1 Image, 2*2 Stitchi | YCB-Video | Mean ADI | 92.4 | PoseCNN + DeepIM |
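For readers unfamiliar with the ADD and ADI metrics named above, the sketch below shows how the underlying distances are commonly defined: ADD averages the distance between corresponding model points under the ground-truth and estimated poses, while ADI (used for symmetric objects) averages the distance from each ground-truth point to its closest estimated point. This is a minimal pure-Python sketch, not the evaluation code behind the reported numbers. The function names and the two-point toy model are assumptions for illustration, and the benchmark scores additionally depend on thresholds and aggregation choices that are not shown here.

```python
import math


def transform(points, R, t):
    """Apply rotation R (3x3 nested lists) and translation t to 3D points."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]


def add_distance(points, R_gt, t_gt, R_est, t_est):
    """ADD: mean distance between corresponding transformed model points."""
    gt = transform(points, R_gt, t_gt)
    est = transform(points, R_est, t_est)
    return sum(math.dist(a, b) for a, b in zip(gt, est)) / len(points)


def adi_distance(points, R_gt, t_gt, R_est, t_est):
    """ADI: mean distance from each ground-truth point to the closest
    estimated point, so symmetric pose ambiguities are not penalised."""
    gt = transform(points, R_gt, t_gt)
    est = transform(points, R_est, t_est)
    return sum(min(math.dist(a, b) for b in est) for a in gt) / len(points)


# Toy model with two points that is symmetric under a 180-degree turn about z.
model = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
half_turn_z = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
zero = (0.0, 0.0, 0.0)

add_value = add_distance(model, identity, zero, half_turn_z, zero)  # 2.0
adi_value = adi_distance(model, identity, zero, half_turn_z, zero)  # 0.0
```

The toy case shows why both metrics are reported: the half-turn estimate is visually indistinguishable from the ground truth for this symmetric model, yet ADD charges it the full point-to-point distance while ADI does not.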

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)