Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition

Yuhang Wen, Mengyuan Liu, Songtao Wu, Beichen Ding

2024-10-09 · 3D Action Recognition · Skeleton Based Action Recognition · Group Activity Recognition · Action Recognition · Human Interaction Recognition

Paper · PDF · Code (official)

Abstract

Skeleton-based multi-entity action recognition is a challenging task aiming to identify interactive actions or group activities involving multiple diverse entities. Existing models for individuals often fall short in this task due to the inherent distribution discrepancies among entity skeletons, leading to suboptimal backbone optimization. To this end, we introduce a Convex Hull Adaptive Shift based multi-Entity action recognition method (CHASE), which mitigates inter-entity distribution gaps and unbiases subsequent backbones. Specifically, CHASE comprises a learnable parameterized network and an auxiliary objective. The parameterized network achieves plausible, sample-adaptive repositioning of skeleton sequences through two key components. First, the Implicit Convex Hull Constrained Adaptive Shift ensures that the new origin of the coordinate system is within the skeleton convex hull. Second, the Coefficient Learning Block provides a lightweight parameterization of the mapping from skeleton sequences to their specific coefficients in convex combinations. Moreover, to guide the optimization of this network for discrepancy minimization, we propose the Mini-batch Pair-wise Maximum Mean Discrepancy as the additional objective. CHASE operates as a sample-adaptive normalization method to mitigate inter-entity distribution discrepancies, thereby reducing data bias and improving the subsequent classifier's multi-entity action recognition performance. Extensive experiments on six datasets, including NTU Mutual 11/26, H2O, Assembly101, Collective Activity and Volleyball, consistently verify our approach by seamlessly adapting to single-entity backbones and boosting their performance in multi-entity scenarios. Our code is publicly available at https://github.com/Necolizer/CHASE .
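The two mechanisms the abstract describes can be sketched compactly: the new coordinate origin is a convex combination of the joints (softmax-normalized coefficients guarantee non-negativity and sum-to-one, so the origin lies inside the skeleton's convex hull), and a maximum mean discrepancy term measures the remaining distribution gap between entities. The function names, tensor shapes, per-frame shifting, and the RBF kernel choice below are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: outputs are >= 0 and sum to 1,
    # i.e. valid coefficients of a convex combination.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def convex_hull_adaptive_shift(skeleton, logits):
    """Shift a skeleton so its origin is a convex combination of its joints.

    skeleton: (T, J, C) array - T frames, J joints, C coordinate dims.
    logits:   (J,) per-joint scores, assumed to come from a small
              coefficient-learning network (hypothetical here).

    Because the origin is a convex combination of the joints, it is
    guaranteed to lie inside the joints' convex hull - the "implicit
    convex hull constraint" from the abstract.
    """
    w = softmax(logits)                            # convex coefficients
    origin = np.einsum('j,tjc->tc', w, skeleton)   # per-frame combination
    return skeleton - origin[:, None, :]

def rbf_mmd(x, y, gamma=1.0):
    """Biased squared MMD with an RBF kernel between feature sets
    x: (n, d) and y: (m, d). A pairwise version of such a statistic
    could serve as the auxiliary discrepancy-minimization objective."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()
```

After the shift, the coefficient-weighted mean of the joints is exactly zero in every frame, which is what normalizes entities with different skeleton distributions into a shared frame of reference.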

Results

Task | Dataset | Metric | Value | Model
---- | ------- | ------ | ----- | -----
Video | Assembly101 | Actions Top-1 | 28.03 | CHASE (CTR-GCN)
Video | H2O (2 Hands and Objects) | Accuracy | 94.77 | CHASE (STSA-Net)
Temporal Action Localization | Assembly101 | Actions Top-1 | 28.03 | CHASE (CTR-GCN)
Temporal Action Localization | H2O (2 Hands and Objects) | Accuracy | 94.77 | CHASE (STSA-Net)
Zero-Shot Learning | Assembly101 | Actions Top-1 | 28.03 | CHASE (CTR-GCN)
Zero-Shot Learning | H2O (2 Hands and Objects) | Accuracy | 94.77 | CHASE (STSA-Net)
Activity Recognition | Assembly101 | Actions Top-1 | 28.03 | CHASE (CTR-GCN)
Activity Recognition | H2O (2 Hands and Objects) | Accuracy | 94.77 | CHASE (STSA-Net)
Activity Recognition | Collective Activity | Accuracy | 89.61 | CHASE (CTR-GCN)
Activity Recognition | Volleyball | Accuracy | 92.89 | CHASE (CTR-GCN)
Action Localization | Assembly101 | Actions Top-1 | 28.03 | CHASE (CTR-GCN)
Action Localization | H2O (2 Hands and Objects) | Accuracy | 94.77 | CHASE (STSA-Net)
Action Detection | H2O (2 Hands and Objects) | Accuracy | 94.77 | CHASE (STSA-Net)
Human Interaction Recognition | NTU RGB+D | Accuracy (Cross-Subject) | 96.5 | CHASE (CTR-GCN)
Human Interaction Recognition | NTU RGB+D | Accuracy (Cross-View) | 98.8 | CHASE (CTR-GCN)
Human Interaction Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 92.3 | CHASE (CTR-GCN)
Human Interaction Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 91.3 | CHASE (CTR-GCN)
3D Action Recognition | Assembly101 | Actions Top-1 | 28.03 | CHASE (CTR-GCN)
3D Action Recognition | H2O (2 Hands and Objects) | Accuracy | 94.77 | CHASE (STSA-Net)
Action Recognition | Assembly101 | Actions Top-1 | 28.03 | CHASE (CTR-GCN)
Action Recognition | H2O (2 Hands and Objects) | Accuracy | 94.77 | CHASE (STSA-Net)

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Adapting Vision-Language Models for Evaluating World Models (2025-06-22)
Active Multimodal Distillation for Few-shot Action Recognition (2025-06-16)