Di Yang, Yaohui Wang, Antitza Dantcheva, Lorenzo Garattoni, Gianpiero Francesca, Francois Bremond
Action recognition based on skeleton data has recently witnessed increasing attention and progress. State-of-the-art approaches adopting Graph Convolutional Networks (GCNs) effectively extract features from human skeletons by relying on a pre-defined human topology. Despite this progress, GCN-based methods have difficulty generalizing across domains, especially across different human topological structures. In this context, we introduce UNIK, a novel skeleton-based action recognition method that not only learns spatio-temporal features on human skeleton sequences effectively but also generalizes across datasets. This is achieved by learning an optimal dependency matrix, initialized from the uniform distribution, via a multi-head attention mechanism. Subsequently, to study the cross-domain generalizability of skeleton-based action recognition in real-world videos, we re-evaluate state-of-the-art approaches as well as the proposed UNIK on a novel Posetics dataset, created from Kinetics-400 videos by estimating, refining, and filtering poses. We analyze how much performance on smaller benchmark datasets improves after pre-training on Posetics for the action classification task. Experimental results show that the proposed UNIK, pre-trained on Posetics, generalizes well and outperforms the state of the art when transferred to four target action classification datasets: Toyota Smarthome, Penn Action, NTU-RGB+D 60, and NTU-RGB+D 120.
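The core idea, learning dependency matrices initialized from a uniform distribution instead of using a fixed skeleton adjacency, can be illustrated with a minimal NumPy sketch. This is a hypothetical simplification, not the authors' implementation: the function name `dependency_layer`, the head count, and all tensor shapes are illustrative assumptions; a head here is a pair of a channel projection and a learnable joint-to-joint dependency matrix.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax used to normalize each head's
    # dependency matrix into attention weights over joints.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dependency_layer(x, weights, deps):
    # x: (C, T, V) features over C channels, T frames, V joints.
    # weights: per-head channel projections, each (C_out, C).
    # deps: per-head learnable (V, V) dependency matrices; initialized
    # uniformly, so no human topology is assumed a priori (unlike a GCN
    # adjacency). Heads are summed, as in multi-head graph aggregation.
    out = 0.0
    for W, A in zip(weights, deps):
        A_norm = softmax(A, axis=-1)  # attention over target joints
        # Project channels with W and mix joints with A_norm.
        out = out + np.einsum('oc,ctv,vw->otw', W, x, A_norm)
    return out

# Toy example: 3 input channels, 10 frames, 17 joints, 3 heads.
C, T, V, H, C_out = 3, 10, 17, 3, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(C, T, V))
weights = [rng.normal(size=(C_out, C)) for _ in range(H)]
deps = [rng.uniform(size=(V, V)) for _ in range(H)]  # uniform init
y = dependency_layer(x, weights, deps)
print(y.shape)  # (8, 10, 17)
```

Because the dependency matrices carry no dataset-specific topology, the same layer applies unchanged to skeletons with different joint layouts, which is what enables the cross-dataset transfer studied in the paper.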
| Task | Dataset | Metric | Value (%) | Model |
|---|---|---|---|---|
| Action Recognition | Penn Action | Accuracy | 97.9 | UNIK |
| Action Recognition | Toyota Smarthome | Cross-Subject (CS) | 64.3 | UNIK |
| Action Recognition | Toyota Smarthome | Cross-View 1 (CV1) | 36.1 | UNIK |
| Action Recognition | Toyota Smarthome | Cross-View 2 (CV2) | 65.0 | UNIK |