Fusing Visual Appearance and Geometry for Multi-modality 6DoF Object Tracking

Manuel Stoiber, Mariam Elsayed, Anne E. Reichert, Florian Steidle, Dongheui Lee, Rudolph Triebel

2023-02-22
3D Object Tracking, Object Tracking, 6D Pose Estimation using RGBD, 6D Pose Estimation

Abstract

In many applications of advanced robotic manipulation, six degrees of freedom (6DoF) object pose estimates are continuously required. In this work, we develop a multi-modality tracker that fuses information from visual appearance and geometry to estimate object poses. The algorithm extends our previous method ICG, which uses geometry, to additionally consider surface appearance. In general, object surfaces contain local characteristics from text, graphics, and patterns, as well as global differences from distinct materials and colors. To incorporate this visual information, two modalities are developed. For local characteristics, keypoint features are used to minimize distances between points from keyframes and the current image. For global differences, a novel region approach is developed that considers multiple regions on the object surface. In addition, it allows the modeling of external geometries. Experiments on the YCB-Video and OPT datasets demonstrate that our approach ICG+ performs best on both datasets, outperforming both conventional and deep learning-based methods. At the same time, the algorithm is highly efficient and runs at more than 300 Hz. The source code of our tracker is publicly available.

Results

Task	Dataset	Metric	Value	Model
Pose Estimation	YCB-Video	ADDS AUC	97.9	ICG+
Pose Estimation	OPT	AUC	17.57	ICG+
3D	YCB-Video	ADDS AUC	97.9	ICG+
3D	OPT	AUC	17.57	ICG+
6D Pose Estimation	YCB-Video	ADDS AUC	97.9	ICG+
6D Pose Estimation	OPT	AUC	17.57	ICG+
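
The "ADDS AUC" metric in the table is commonly written ADD-S AUC. As a rough illustration of how such a score can be computed, the sketch below evaluates the ADD-S error (the mean distance from each model point under the ground-truth pose to its closest model point under the estimated pose) and integrates the accuracy-versus-threshold curve. It is not taken from the paper's evaluation code: the function names, the brute-force nearest-neighbor search, the 0.1 m maximum threshold, and the number of threshold steps are all assumptions made for this example.

```python
import numpy as np


def transform(points, rotation, translation):
    """Apply a rigid transform to an (N, 3) array of model points."""
    return points @ rotation.T + translation


def add_s(points, r_est, t_est, r_gt, t_gt):
    """ADD-S error: mean closest-point distance between the model under
    the ground-truth pose and the model under the estimated pose."""
    est = transform(points, r_est, t_est)
    gt = transform(points, r_gt, t_gt)
    # Pairwise distances, shape (N, N); brute force, fine for small models.
    dists = np.linalg.norm(gt[:, None, :] - est[None, :, :], axis=2)
    # For each ground-truth point, take the nearest estimated point.
    return dists.min(axis=1).mean()


def auc(errors, max_threshold=0.1, steps=1000):
    """Area under the accuracy-threshold curve, scaled to [0, 100].

    max_threshold (in meters) and steps are assumed values for this sketch.
    """
    errors = np.asarray(errors, dtype=float)
    thresholds = np.linspace(0.0, max_threshold, steps)
    accuracy = [(errors <= th).mean() for th in thresholds]
    # Mean accuracy over uniformly spaced thresholds approximates the
    # normalized area under the curve.
    return 100.0 * float(np.mean(accuracy))


# Usage: one frame with a small translation error on a toy model.
model = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0]])
identity = np.eye(3)
frame_error = add_s(model, identity, np.array([0.0, 0.0, 0.005]),
                    identity, np.zeros(3))
score = auc([frame_error])
```

Because ADD-S matches each point to its nearest neighbor rather than to its corresponding point, it does not penalize pose ambiguities of symmetric objects, which is why it is often preferred for mixed object sets.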