Sulabh Kumra, Shirin Joshi, Ferat Sahin
In this paper, we present a modular robotic system to tackle the problem of generating and performing antipodal robotic grasps for unknown objects from an n-channel image of the scene. We propose a novel Generative Residual Convolutional Neural Network (GR-ConvNet) model that can generate robust antipodal grasps from n-channel input at real-time speeds (~20 ms). We evaluate the proposed model architecture on standard datasets and a diverse set of household objects. We achieve state-of-the-art accuracy of 97.7% and 94.6% on the Cornell and Jacquard grasping datasets, respectively. We also demonstrate grasp success rates of 95.4% and 93% on household and adversarial objects, respectively, using a 7-DoF robotic arm.
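To make the idea of generating an antipodal grasp from an image concrete, the following is a minimal sketch and not the GR-ConvNet implementation. It assumes, hypothetically, that a generative grasp model emits three per-pixel maps (grasp quality, gripper angle, gripper width); the abstract does not specify the output format. The names `Grasp`, `decode_best_grasp`, and `contact_points`, and the toy 3x3 maps, are illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class Grasp:
    row: int        # pixel row of the grasp centre
    col: int        # pixel column of the grasp centre
    angle: float    # gripper rotation in radians, measured from the column axis
    width: float    # gripper opening in pixels
    quality: float  # predicted grasp score (assumed to lie in [0, 1])


def decode_best_grasp(quality, angle, width):
    """Pick the pixel with the highest quality and read angle/width there."""
    best = None
    for r, q_row in enumerate(quality):
        for c, q in enumerate(q_row):
            if best is None or q > best.quality:
                best = Grasp(r, c, angle[r][c], width[r][c], q)
    return best


def contact_points(g):
    """Two opposing (antipodal) finger positions, half a width either side of the centre."""
    dr = 0.5 * g.width * math.sin(g.angle)
    dc = 0.5 * g.width * math.cos(g.angle)
    return (g.row - dr, g.col - dc), (g.row + dr, g.col + dc)


# Hypothetical 3x3 maps standing in for network output (assumption, not real data).
quality = [[0.1, 0.2, 0.1], [0.3, 0.9, 0.4], [0.1, 0.2, 0.1]]
angle = [[0.0] * 3, [0.0, math.pi / 2, 0.0], [0.0] * 3]
width = [[1.0] * 3, [1.0, 2.0, 1.0], [1.0] * 3]

best = decode_best_grasp(quality, angle, width)
p1, p2 = contact_points(best)
```

In this toy case the centre pixel has the highest quality, so the two finger positions lie on opposite sides of it along the predicted angle. A real system would additionally map pixel coordinates to the robot frame using depth and camera calibration, which this sketch omits.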
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Robotic Grasping | Cornell Grasp Dataset | 5-fold cross-validation accuracy (%) | 97.7 | GR-ConvNet |
| Robotic Grasping | Jacquard dataset | Accuracy (%) | 94.6 | GR-ConvNet |