Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation

He Wang, Srinath Sridhar, Jingwei Huang, Julien Valentin, Shuran Song, Leonidas J. Guibas

2019-01-09 | CVPR 2019 6 | Mixed Reality | Pose Estimation | 6D Pose Estimation using RGB | 6D Pose Estimation

Abstract

The goal of this paper is to estimate the 6D pose and dimensions of unseen object instances in an RGB-D image. Contrary to "instance-level" 6D pose estimation tasks, our problem assumes that no exact object CAD models are available during either training or testing time. To handle different and unseen object instances in a given category, we introduce a Normalized Object Coordinate Space (NOCS)---a shared canonical representation for all possible object instances within a category. Our region-based neural network is then trained to directly infer the correspondence from observed pixels to this shared object representation (NOCS) along with other object information such as class label and instance mask. These predictions can be combined with the depth map to jointly estimate the metric 6D pose and dimensions of multiple objects in a cluttered scene. To train our network, we present a new context-aware technique to generate large amounts of fully annotated mixed reality data. To further improve our model and evaluate its performance on real data, we also provide a fully annotated real-world dataset with large environment and instance variation. Extensive experiments demonstrate that the proposed method is able to robustly estimate the pose and size of unseen object instances in real environments while also achieving state-of-the-art performance on standard 6D pose estimation benchmarks.
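The abstract says the predicted NOCS correspondences are combined with the depth map to recover metric pose and size, but it does not spell out the fitting step. A minimal sketch of one standard way to do this is a least-squares similarity fit (Umeyama's closed-form method) between predicted NOCS coordinates and the back-projected depth points of the same pixels. The function names `fit_similarity` and `metric_size` are hypothetical, and reading the object's dimensions as the scale times the NOCS bounding-box extent is an assumption, not something the abstract states.

```python
import numpy as np

def fit_similarity(nocs_pts, cam_pts):
    """Least-squares s, R, t such that cam ≈ s * R @ nocs + t (Umeyama fit).

    nocs_pts: (N, 3) predicted NOCS coordinates for the object's pixels.
    cam_pts:  (N, 3) the same pixels back-projected from depth, in metres.
    """
    src = np.asarray(nocs_pts, dtype=float)
    dst = np.asarray(cam_pts, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between the centred target and source points.
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * (R @ mu_src)
    return s, R, t

def metric_size(nocs_pts, s):
    """Assumed size estimate: NOCS bounding-box extent scaled to metres."""
    pts = np.asarray(nocs_pts, dtype=float)
    return s * (pts.max(axis=0) - pts.min(axis=0))
```

Here `R` and `t` give the 6D pose and `s` carries the metric scale. Real network predictions contain outliers, so a robust wrapper (for example RANSAC around this fit) would normally be needed; that is left out of the sketch.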

Results

Task                  Dataset   Metric         Value  Model
Pose Estimation       REAL275   mAP 10, 10cm   26.7   NOCS (128 bins)
Pose Estimation       REAL275   mAP 10, 5cm    26.7   NOCS (128 bins)
Pose Estimation       REAL275   mAP 3D IoU@25  84.9   NOCS (128 bins)
Pose Estimation       REAL275   mAP 3D IoU@50  80.5   NOCS (128 bins)
Pose Estimation       REAL275   mAP 5, 5cm     9.5    NOCS (128 bins)
Pose Estimation       CAMERA25  mAP 10, 10cm   62.2   NOCS (128 bins)
Pose Estimation       CAMERA25  mAP 10, 5cm    61.7   NOCS (128 bins)
Pose Estimation       CAMERA25  mAP 3D IoU@25  91.4   NOCS (128 bins)
Pose Estimation       CAMERA25  mAP 3D IoU@50  85.3   NOCS (128 bins)
Pose Estimation       CAMERA25  mAP 5, 5cm     38.8   NOCS (128 bins)
3D                    REAL275   mAP 10, 10cm   26.7   NOCS (128 bins)
3D                    REAL275   mAP 10, 5cm    26.7   NOCS (128 bins)
3D                    REAL275   mAP 3D IoU@25  84.9   NOCS (128 bins)
3D                    REAL275   mAP 3D IoU@50  80.5   NOCS (128 bins)
3D                    REAL275   mAP 5, 5cm     9.5    NOCS (128 bins)
3D                    CAMERA25  mAP 10, 10cm   62.2   NOCS (128 bins)
3D                    CAMERA25  mAP 10, 5cm    61.7   NOCS (128 bins)
3D                    CAMERA25  mAP 3D IoU@25  91.4   NOCS (128 bins)
3D                    CAMERA25  mAP 3D IoU@50  85.3   NOCS (128 bins)
3D                    CAMERA25  mAP 5, 5cm     38.8   NOCS (128 bins)
1 Image, 2*2 Stitchi  REAL275   mAP 10, 10cm   26.7   NOCS (128 bins)
1 Image, 2*2 Stitchi  REAL275   mAP 10, 5cm    26.7   NOCS (128 bins)
1 Image, 2*2 Stitchi  REAL275   mAP 3D IoU@25  84.9   NOCS (128 bins)
1 Image, 2*2 Stitchi  REAL275   mAP 3D IoU@50  80.5   NOCS (128 bins)
1 Image, 2*2 Stitchi  REAL275   mAP 5, 5cm     9.5    NOCS (128 bins)
1 Image, 2*2 Stitchi  CAMERA25  mAP 10, 10cm   62.2   NOCS (128 bins)
1 Image, 2*2 Stitchi  CAMERA25  mAP 10, 5cm    61.7   NOCS (128 bins)
1 Image, 2*2 Stitchi  CAMERA25  mAP 3D IoU@25  91.4   NOCS (128 bins)
1 Image, 2*2 Stitchi  CAMERA25  mAP 3D IoU@50  85.3   NOCS (128 bins)
1 Image, 2*2 Stitchi  CAMERA25  mAP 5, 5cm     38.8   NOCS (128 bins)
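
The pose metrics in the table pair two thresholds, for example "5, 5cm". Reading the first number as a rotation error limit in degrees and the second as a translation error limit in centimetres is an assumption here, though it matches common practice in 6D pose evaluation. Under that reading, the per-prediction check behind such a metric can be sketched as follows; the helper names `pose_errors` and `within` are hypothetical, and the sketch covers a single prediction only, not the averaging that produces mAP or any special handling of symmetric objects.

```python
import math

def pose_errors(R_pred, t_pred, R_gt, t_gt):
    """Return (rotation error in degrees, translation error in cm).

    Rotations are 3x3 nested lists; translations are 3-vectors in metres.
    """
    # trace(R_pred^T R_gt) equals the sum of elementwise products.
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against rounding pushing the cosine outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    rot_deg = math.degrees(math.acos(cos_angle))
    trans_cm = 100.0 * math.dist(t_pred, t_gt)
    return rot_deg, trans_cm

def within(rot_deg, trans_cm, max_deg, max_cm):
    """True if a prediction meets both thresholds of a 'deg, cm' metric."""
    return rot_deg <= max_deg and trans_cm <= max_cm
```

A prediction that is off by 8 degrees and 7 cm would, under this check, count toward the "10, 10cm" threshold but not toward "10, 5cm" or "5, 5cm".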

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)