Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation

Muhammad Zubair Irshad, Thomas Kollar, Michael Laskey, Kevin Stone, Zsolt Kira

2022-03-03 · 3D Shape Reconstruction · Pose Estimation · 3D Reconstruction · 6D Pose Estimation using RGBD · 6D Pose Estimation
Abstract

This paper studies the complex task of simultaneous multi-object 3D reconstruction and 6D pose and size estimation from a single-view RGB-D observation. In contrast to instance-level pose estimation, we focus on a more challenging problem where CAD models are not available at inference time. Existing approaches mainly follow a complex multi-stage pipeline that first localizes and detects each object instance in the image and then regresses to either its 3D mesh or its 6D pose. These approaches suffer from high computational cost and low performance in complex multi-object scenarios, where occlusions can be present. Hence, we present a simple one-stage approach that jointly predicts the 3D shape and estimates the 6D pose and size in a bounding-box-free manner. In particular, our method treats object instances as spatial centers, where each center denotes the complete shape of an object along with its 6D pose and size. Through this per-pixel representation, our approach can reconstruct multiple novel object instances in real time (40 FPS) and predict their 6D poses and sizes in a single forward pass. Through extensive experiments, we demonstrate that our approach significantly outperforms all shape completion and categorical 6D pose and size estimation baselines on the multi-object ShapeNet and NOCS datasets, respectively, with a 12.6% absolute improvement in mAP for 6D pose for novel real-world object instances.
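The abstract describes object instances as spatial centers in a per-pixel representation. As a rough illustration of that idea only, the sketch below picks object centers out of a 2D heatmap by keeping pixels that are strict local maxima in their 3x3 neighbourhood and exceed a score threshold. The function name `find_center_peaks`, the 3x3 window, and the default threshold are assumptions made for this example; the page does not say how the authors extract centers.

```python
from typing import List, Tuple


def find_center_peaks(
    heatmap: List[List[float]], threshold: float = 0.5
) -> List[Tuple[int, int, float]]:
    """Return (row, col, score) for pixels that are strict 3x3 local maxima.

    Hypothetical helper: the window size and threshold are illustrative
    assumptions, not values taken from the paper.
    """
    height = len(heatmap)
    width = len(heatmap[0]) if height else 0
    peaks = []
    for r in range(height):
        for c in range(width):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Collect the up-to-8 neighbours, clipping at the image border.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
                if (rr, cc) != (r, c)
            ]
            # Strict comparison: flat plateaus produce no peak at all.
            if all(score > n for n in neighbours):
                peaks.append((r, c, score))
    return peaks


example_heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.0],
    [0.1, 0.2, 0.1, 0.7],
    [0.0, 0.0, 0.0, 0.1],
]
example_peaks = find_center_peaks(example_heatmap)
```

In a center-based design of this kind, each detected peak location would then index into the per-pixel outputs to read off that object's shape, pose, and size predictions in the same forward pass.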

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Pose Estimation | REAL275 | mAP 10, 10cm | 70.9 | CenterSnap |
| Pose Estimation | REAL275 | mAP 10, 5cm | 64.3 | CenterSnap |
| Pose Estimation | REAL275 | mAP 3DIou@25 | 83.5 | CenterSnap |
| Pose Estimation | REAL275 | mAP 3DIou@50 | 80.2 | CenterSnap |
| Pose Estimation | REAL275 | mAP 5, 5cm | 29.1 | CenterSnap |
| Pose Estimation | CAMERA25 | mAP 10, 10cm | 87.9 | CenterSnap |
| Pose Estimation | CAMERA25 | mAP 10, 5cm | 81.3 | CenterSnap |
| Pose Estimation | CAMERA25 | mAP 3DIou@25 | 93.2 | CenterSnap |
| Pose Estimation | CAMERA25 | mAP 3DIou@50 | 92.5 | CenterSnap |
| Pose Estimation | CAMERA25 | mAP 5, 5cm | 66.2 | CenterSnap |
| 3D | REAL275 | mAP 10, 10cm | 70.9 | CenterSnap |
| 3D | REAL275 | mAP 10, 5cm | 64.3 | CenterSnap |
| 3D | REAL275 | mAP 3DIou@25 | 83.5 | CenterSnap |
| 3D | REAL275 | mAP 3DIou@50 | 80.2 | CenterSnap |
| 3D | REAL275 | mAP 5, 5cm | 29.1 | CenterSnap |
| 3D | CAMERA25 | mAP 10, 10cm | 87.9 | CenterSnap |
| 3D | CAMERA25 | mAP 10, 5cm | 81.3 | CenterSnap |
| 3D | CAMERA25 | mAP 3DIou@25 | 93.2 | CenterSnap |
| 3D | CAMERA25 | mAP 3DIou@50 | 92.5 | CenterSnap |
| 3D | CAMERA25 | mAP 5, 5cm | 66.2 | CenterSnap |
| 1 Image, 2*2 Stitchi | REAL275 | mAP 10, 10cm | 70.9 | CenterSnap |
| 1 Image, 2*2 Stitchi | REAL275 | mAP 10, 5cm | 64.3 | CenterSnap |
| 1 Image, 2*2 Stitchi | REAL275 | mAP 3DIou@25 | 83.5 | CenterSnap |
| 1 Image, 2*2 Stitchi | REAL275 | mAP 3DIou@50 | 80.2 | CenterSnap |
| 1 Image, 2*2 Stitchi | REAL275 | mAP 5, 5cm | 29.1 | CenterSnap |
| 1 Image, 2*2 Stitchi | CAMERA25 | mAP 10, 10cm | 87.9 | CenterSnap |
| 1 Image, 2*2 Stitchi | CAMERA25 | mAP 10, 5cm | 81.3 | CenterSnap |
| 1 Image, 2*2 Stitchi | CAMERA25 | mAP 3DIou@25 | 93.2 | CenterSnap |
| 1 Image, 2*2 Stitchi | CAMERA25 | mAP 3DIou@50 | 92.5 | CenterSnap |
| 1 Image, 2*2 Stitchi | CAMERA25 | mAP 5, 5cm | 66.2 | CenterSnap |
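
The pose metrics in the results above pair a rotation threshold in degrees with a translation threshold in centimetres (for example, "5, 5cm" means at most 5 degrees and 5 cm). The sketch below shows one way the per-object check behind such a metric could be computed, assuming rotations are given as 3x3 matrices and translations in metres. The function names are hypothetical, and the sketch deliberately omits the special handling of symmetric object categories and the averaging over detections that a full mAP evaluation would need.

```python
import math
from typing import Sequence

Matrix3 = Sequence[Sequence[float]]


def rotation_error_deg(R_pred: Matrix3, R_gt: Matrix3) -> float:
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(R_pred^T @ R_gt) equals the elementwise product summed over entries.
    trace = sum(R_pred[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def translation_error_cm(t_pred: Sequence[float], t_gt: Sequence[float]) -> float:
    """Euclidean distance in centimetres, assuming inputs are in metres."""
    return 100.0 * math.dist(t_pred, t_gt)


def pose_within(R_pred, t_pred, R_gt, t_gt, max_deg: float, max_cm: float) -> bool:
    """True if both the rotation and translation errors are within the thresholds."""
    return (
        rotation_error_deg(R_pred, R_gt) <= max_deg
        and translation_error_cm(t_pred, t_gt) <= max_cm
    )


# Example: prediction is off by 8 degrees about the z-axis and 3 cm along z.
angle = math.radians(8.0)
R_pred = [
    [math.cos(angle), -math.sin(angle), 0.0],
    [math.sin(angle), math.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
]
R_gt = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_pred, t_gt = (0.0, 0.0, 0.03), (0.0, 0.0, 0.0)

passes_10deg_5cm = pose_within(R_pred, t_pred, R_gt, t_gt, max_deg=10.0, max_cm=5.0)
passes_5deg_5cm = pose_within(R_pred, t_pred, R_gt, t_gt, max_deg=5.0, max_cm=5.0)
```

In this example the prediction passes the looser 10 degree, 5 cm check but fails the stricter 5 degree, 5 cm check, which mirrors why the "5, 5cm" rows report lower values than the "10, 5cm" rows.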

Related Papers

$π^3$: Scalable Permutation-Equivariant Visual Geometry Learning (2025-07-17)
Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark (2025-07-17)
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model (2025-07-17)
From Neck to Head: Bio-Impedance Sensing for Head Pose Estimation (2025-07-17)
AthleticsPose: Authentic Sports Motion Dataset on Athletic Field and Evaluation of Monocular 3D Pose Estimation Ability (2025-07-17)
AutoPartGen: Autogressive 3D Part Generation and Discovery (2025-07-17)
SpatialTrackerV2: 3D Point Tracking Made Easy (2025-07-16)
SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation (2025-07-16)