Fan Yang, Sakriani Sakti, Yang Wu, Satoshi Nakamura
Although skeleton-based action recognition has achieved great success in recent years, most existing methods may suffer from a large model size and slow execution speed. To alleviate this issue, we analyze the properties of skeleton sequences and propose a Double-feature Double-motion Network (DD-Net) for skeleton-based action recognition. With a lightweight network structure (0.15 million parameters), DD-Net runs very fast: 3,500 FPS on one GPU or 2,000 FPS on one CPU. By employing robust features, DD-Net achieves state-of-the-art performance on our experimental datasets: SHREC (hand actions) and JHMDB (body actions). Our code will be released with this paper later.
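
The abstract does not spell out what the two features and two motions are. As a rough illustration only, the sketch below assumes one plausible reading: a location-invariant feature built from pairwise joint distances within each frame, plus temporal differences of joint coordinates taken at two frame steps (a "slow" and a "fast" motion). The function names, the toy data, and the choice of steps 1 and 2 are assumptions made for this example, not details confirmed by the text above.

```python
import math
from itertools import combinations

def pairwise_joint_distances(frame):
    """Euclidean distance for every unordered pair of joints in one frame.

    Assumed stand-in for a location-invariant skeleton feature:
    translating the whole skeleton leaves these values unchanged.
    """
    return [math.dist(a, b) for a, b in combinations(frame, 2)]

def motion(frames, step):
    """Per-joint coordinate difference between frames `step` apart.

    step=1 and step=2 are hypothetical choices for a slow and a fast scale.
    """
    return [
        [tuple(c1 - c0 for c0, c1 in zip(j0, j1)) for j0, j1 in zip(f0, f1)]
        for f0, f1 in zip(frames, frames[step:])
    ]

# Toy sequence: 3 frames, 3 joints, 2D coordinates (made-up values).
# The skeleton keeps its shape and only slides along the x axis.
frames = [
    [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)],
    [(1.0, 0.0), (4.0, 4.0), (7.0, 8.0)],
    [(3.0, 0.0), (6.0, 4.0), (9.0, 8.0)],
]

distances = [pairwise_joint_distances(f) for f in frames]
slow = motion(frames, 1)  # differences between consecutive frames
fast = motion(frames, 2)  # differences between frames two apart
```

In this toy sequence the distance feature is identical in every frame because the skeleton is only translated, while the two motion features capture the displacement at different time scales. A real model would feed such features into a learned network; that part is outside the scope of this sketch.
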
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video | JHMDB (2D poses only) | Average accuracy of 3 splits | 77.2 | DD-Net |
| Video | J-HMDB | Accuracy (pose) | 77.2 | DD-Net |
| Temporal Action Localization | JHMDB (2D poses only) | Average accuracy of 3 splits | 77.2 | DD-Net |
| Temporal Action Localization | J-HMDB | Accuracy (pose) | 77.2 | DD-Net |
| Zero-Shot Learning | JHMDB (2D poses only) | Average accuracy of 3 splits | 77.2 | DD-Net |
| Zero-Shot Learning | J-HMDB | Accuracy (pose) | 77.2 | DD-Net |
| Activity Recognition | JHMDB (2D poses only) | Average accuracy of 3 splits | 77.2 | DD-Net |
| Activity Recognition | J-HMDB | Accuracy (pose) | 77.2 | DD-Net |
| Action Localization | JHMDB (2D poses only) | Average accuracy of 3 splits | 77.2 | DD-Net |
| Action Localization | J-HMDB | Accuracy (pose) | 77.2 | DD-Net |
| Hand | DHG-28 | Accuracy | 91.9 | DD-Net |
| Hand | SHREC 2017 track on 3D Hand Gesture Recognition | 14 gestures accuracy | 94.6 | DD-Net |
| Hand | DHG-14 | Accuracy | 94.6 | DD-Net |
| Action Detection | JHMDB (2D poses only) | Average accuracy of 3 splits | 77.2 | DD-Net |
| Action Detection | J-HMDB | Accuracy (pose) | 77.2 | DD-Net |
| Gesture Recognition | DHG-28 | Accuracy | 91.9 | DD-Net |
| Gesture Recognition | SHREC 2017 track on 3D Hand Gesture Recognition | 14 gestures accuracy | 94.6 | DD-Net |
| Gesture Recognition | DHG-14 | Accuracy | 94.6 | DD-Net |
| 3D Action Recognition | JHMDB (2D poses only) | Average accuracy of 3 splits | 77.2 | DD-Net |
| 3D Action Recognition | J-HMDB | Accuracy (pose) | 77.2 | DD-Net |
| Action Recognition | JHMDB (2D poses only) | Average accuracy of 3 splits | 77.2 | DD-Net |
| Action Recognition | J-HMDB | Accuracy (pose) | 77.2 | DD-Net |