
FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism

Wei Chen, Xi Jia, Hyung Jin Chang, Jinming Duan, Linlin Shen, Ales Leonardis

Published: 2021-03-12 (CVPR 2021)
Tasks: Translation, Pose Estimation, 6D Pose Estimation using RGB, 6D Pose Estimation using RGBD, 6D Pose Estimation

Abstract

In this paper, we focus on category-level 6D pose and size estimation from a monocular RGB-D image. Previous methods suffer from inefficient category-level pose feature extraction, which leads to low accuracy and slow inference. To tackle this problem, we propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation. First, we design an orientation-aware autoencoder with 3D graph convolution for latent feature extraction. The learned latent feature is insensitive to point shift and object size thanks to the shift- and scale-invariance properties of the 3D graph convolution. Then, to efficiently decode category-level rotation information from the latent feature, we propose a novel decoupled rotation mechanism that employs two decoders to complementarily access the rotation information. Meanwhile, we estimate translation and size through two residuals: the difference between the mean of the object points and the ground truth translation, and the difference between the mean size of the category and the ground truth size, respectively. Finally, to increase the generalization ability of FS-Net, we propose an online box-cage-based 3D deformation mechanism to augment the training data. Extensive experiments on two benchmark datasets show that the proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation. In particular, for category-level pose estimation, our method outperforms existing methods by 6.3% on the NOCS-REAL dataset without using extra synthetic data.
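The residual formulation for translation and size described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tuple-based point representation, and the sign convention (ground truth minus prior) are assumptions made for illustration, and the network that would predict the residuals is left out entirely.

```python
from statistics import fmean


def mean_point(points):
    """Centroid of a list of (x, y, z) points."""
    xs, ys, zs = zip(*points)
    return (fmean(xs), fmean(ys), fmean(zs))


def add(a, b):
    """Element-wise sum of two equal-length tuples."""
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    """Element-wise difference a - b of two equal-length tuples."""
    return tuple(x - y for x, y in zip(a, b))


def residual_targets(points, gt_translation, category_mean_size, gt_size):
    """Regression targets under the assumed sign convention.

    Translation residual: ground truth translation minus the point centroid.
    Size residual: ground truth size minus the category's mean size.
    """
    t_res = sub(gt_translation, mean_point(points))
    s_res = sub(gt_size, category_mean_size)
    return t_res, s_res


def decode(points, category_mean_size, pred_t_res, pred_s_res):
    """Recover absolute translation and size from predicted residuals."""
    translation = add(mean_point(points), pred_t_res)
    size = add(category_mean_size, pred_s_res)
    return translation, size


# Hypothetical toy values, chosen only to exercise the functions.
points = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (1.0, 3.0, 1.0)]
gt_translation = (1.5, 1.0, 1.25)
category_mean_size = (0.5, 0.25, 0.25)
gt_size = (0.75, 0.25, 0.5)

t_res, s_res = residual_targets(points, gt_translation, category_mean_size, gt_size)
translation, size = decode(points, category_mean_size, t_res, s_res)
```

The point of the design is that the centroid and the category mean size are cheap priors, so the regressed quantities stay small and centred regardless of where the object sits in the scene.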

Results

Task | Dataset | Metric | Value | Model
Pose Estimation | REAL275 | FPS | 20 | FS-Net
Pose Estimation | REAL275 | mAP 10°, 10cm | 64.6 | FS-Net
Pose Estimation | REAL275 | mAP 10°, 5cm | 60.8 | FS-Net
Pose Estimation | REAL275 | mAP 3DIoU@25 | 95.1 | FS-Net
Pose Estimation | REAL275 | mAP 3DIoU@50 | 92.2 | FS-Net
Pose Estimation | REAL275 | mAP 3DIoU@75 | 63.5 | FS-Net
Pose Estimation | REAL275 | mAP 5°, 5cm | 28.2 | FS-Net
3D | REAL275 | FPS | 20 | FS-Net
3D | REAL275 | mAP 10°, 10cm | 64.6 | FS-Net
3D | REAL275 | mAP 10°, 5cm | 60.8 | FS-Net
3D | REAL275 | mAP 3DIoU@25 | 95.1 | FS-Net
3D | REAL275 | mAP 3DIoU@50 | 92.2 | FS-Net
3D | REAL275 | mAP 3DIoU@75 | 63.5 | FS-Net
3D | REAL275 | mAP 5°, 5cm | 28.2 | FS-Net
1 Image, 2*2 Stitchi | REAL275 | FPS | 20 | FS-Net
1 Image, 2*2 Stitchi | REAL275 | mAP 10°, 10cm | 64.6 | FS-Net
1 Image, 2*2 Stitchi | REAL275 | mAP 10°, 5cm | 60.8 | FS-Net
1 Image, 2*2 Stitchi | REAL275 | mAP 3DIoU@25 | 95.1 | FS-Net
1 Image, 2*2 Stitchi | REAL275 | mAP 3DIoU@50 | 92.2 | FS-Net
1 Image, 2*2 Stitchi | REAL275 | mAP 3DIoU@75 | 63.5 | FS-Net
1 Image, 2*2 Stitchi | REAL275 | mAP 5°, 5cm | 28.2 | FS-Net
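
Metrics such as "mAP 10, 5cm" in the results above are conventionally read as a rotation threshold in degrees paired with a translation threshold in centimetres. The sketch below shows only the per-prediction threshold check, under stated assumptions: rotations are 3x3 matrices given as nested lists, translations are in metres, and the function names are hypothetical. It does not reproduce the benchmark's evaluation protocol, which also has to handle symmetric objects and aggregate over detections to obtain mAP.

```python
import math


def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(R_pred^T R_gt) equals the sum of element-wise products.
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_error_cm(t_pred, t_gt):
    """Euclidean distance; inputs assumed in metres, result in centimetres."""
    return 100.0 * math.dist(t_pred, t_gt)


def within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg, max_cm):
    """True if both the rotation and translation errors are under the limits."""
    return (rotation_error_deg(R_pred, R_gt) <= max_deg
            and translation_error_cm(t_pred, t_gt) <= max_cm)


def rot_z(deg):
    """Rotation about the z-axis by `deg` degrees (helper for the example)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Hypothetical prediction: 8 degrees and 7 cm away from the ground truth.
R_gt, t_gt = rot_z(0.0), (0.0, 0.0, 0.0)
R_pred, t_pred = rot_z(8.0), (0.0, 0.0, 0.07)

passes_10deg_10cm = within_threshold(R_pred, t_pred, R_gt, t_gt, 10.0, 10.0)
passes_10deg_5cm = within_threshold(R_pred, t_pred, R_gt, t_gt, 10.0, 5.0)
passes_5deg_5cm = within_threshold(R_pred, t_pred, R_gt, t_gt, 5.0, 5.0)
```

In this toy case the prediction would count as correct only under the loosest of the three thresholds, which is consistent with the reported scores dropping as the thresholds tighten.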

Related Papers

A Translation of Probabilistic Event Calculus into Markov Decision Processes (2025-07-17)
$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)