# 6-PACK: Category-Level 6D Pose Tracker with Anchor-Based Keypoints

Chen Wang, Roberto Martín-Martín, Danfei Xu, Jun Lv, Cewu Lu, Li Fei-Fei, Silvio Savarese, Yuke Zhu
We present 6-PACK, a deep learning approach to category-level 6D object pose tracking on RGB-D data. Our method tracks novel object instances of known object categories, such as bowls, laptops, and mugs, in real time. 6-PACK learns to compactly represent an object by a handful of 3D keypoints, based on which the interframe motion of an object instance can be estimated through keypoint matching. These keypoints are learned end-to-end, without manual supervision, to be maximally effective for tracking. Our experiments show that our method substantially outperforms existing methods on the NOCS category-level 6D pose estimation benchmark and supports a physical robot in performing simple vision-based closed-loop manipulation tasks. Our code and video are available at https://sites.google.com/view/6packtracking.
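The interframe motion estimation described above reduces to a classical subproblem: given matched 3D keypoints from two consecutive frames, recover the rigid transform between them. A minimal sketch using the SVD-based Kabsch/Umeyama solution is shown below; this is an illustrative stand-in, not the paper's exact implementation, and the function name is ours.

```python
import numpy as np

def rigid_transform_from_keypoints(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst.

    src, dst: (K, 3) arrays of matched 3D keypoints from two
    consecutive frames. Returns R (3x3) and t (3,) such that
    dst ≈ src @ R.T + t.
    """
    # Center both keypoint sets on their centroids.
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)

    # 3x3 cross-covariance of the centered correspondences.
    H = (src - src_c).T @ (dst - dst_c)

    # SVD-based rotation estimate (Kabsch algorithm).
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Translation follows from the rotated centroid offset.
    t = dst_c - R @ src_c
    return R, t
```

Because the keypoints are learned to be stable across frames, even a handful of correspondences (the paper's "handful of 3D keypoints") suffices for this closed-form solve at every tracking step.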
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| 6D Pose Tracking | REAL275 | Rotation error, Rerr (°) | 16 | 6-PACK |
| 6D Pose Tracking | REAL275 | Translation error, Terr (cm) | 3.5 | 6-PACK |
| 6D Pose Tracking | REAL275 | mAP, 3D IoU@25 (%) | 94.2 | 6-PACK |
| 6D Pose Tracking | REAL275 | mAP, 5° 5 cm (%) | 33.3 | 6-PACK |