Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Continual Spatio-Temporal Graph Convolutional Networks

Lukas Hedegaard, Negar Heidari, Alexandros Iosifidis

2022-03-21 | Skeleton Based Action Recognition, Action Recognition, Temporal Action Localization, Temporal Sequences

Abstract

Graph-based reasoning over skeleton data has emerged as a promising approach for human action recognition. However, the application of prior graph-based methods, which predominantly employ whole temporal sequences as their input, to the setting of online inference entails considerable computational redundancy. In this paper, we tackle this issue by reformulating the Spatio-Temporal Graph Convolutional Neural Network as a Continual Inference Network, which can perform step-by-step predictions in time without repeat frame processing. To evaluate our method, we create a continual version of ST-GCN, CoST-GCN, alongside two derived methods with different self-attention mechanisms, CoAGCN and CoS-TR. We investigate weight transfer strategies and architectural modifications for inference acceleration, and perform experiments on the NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton 400 datasets. Retaining similar predictive accuracy, we observe up to 109x reduction in time complexity, on-hardware accelerations of 26x, and reductions in maximum allocated memory of 52% during online inference.
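The idea of step-by-step prediction without reprocessing earlier frames can be illustrated with a single temporal convolution. The sketch below is not the authors' implementation: the class name `ContinualTemporalConv`, the helper `clip_conv`, and the tensor shapes are assumptions chosen for illustration, and the graph (spatial) part of ST-GCN is left out. It keeps the last `kernel_size` frames in a buffer and emits one output per incoming frame, which matches what a sliding-window convolution over the whole clip would produce at that time step.

```python
from collections import deque

import numpy as np


class ContinualTemporalConv:
    """Temporal convolution evaluated one frame at a time (illustrative only)."""

    def __init__(self, weight):
        # weight has shape (kernel_size, channels_in, channels_out)
        self.weight = np.asarray(weight, dtype=float)
        self.kernel_size = self.weight.shape[0]
        # Ring buffer holding only the most recent kernel_size frames
        self.buffer = deque(maxlen=self.kernel_size)

    def step(self, frame):
        """Consume one frame of shape (channels_in,); return one output or None."""
        self.buffer.append(np.asarray(frame, dtype=float))
        if len(self.buffer) < self.kernel_size:
            return None  # not enough temporal context yet
        window = np.stack(list(self.buffer))  # (kernel_size, channels_in)
        return np.einsum("kc,kco->o", window, self.weight)


def clip_conv(clip, weight):
    """Reference: the same convolution applied to a whole clip at once."""
    k = weight.shape[0]
    return np.stack(
        [np.einsum("kc,kco->o", clip[t:t + k], weight)
         for t in range(len(clip) - k + 1)]
    )


rng = np.random.default_rng(0)
clip = rng.normal(size=(10, 3))       # 10 frames, 3 input channels
weight = rng.normal(size=(4, 3, 2))   # kernel size 4, 2 output channels

layer = ContinualTemporalConv(weight)
stepwise = [layer.step(frame) for frame in clip]
stepwise_outputs = np.stack([out for out in stepwise if out is not None])

# Each new frame costs one kernel evaluation, yet the outputs agree with
# running the convolution over the full clip.
print(np.allclose(stepwise_outputs, clip_conv(clip, weight)))
```

In an online setting, a clip-based model would have to rerun `clip_conv` over the whole window for every new frame, whereas the step-wise version only computes the newest output.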

Results

Task | Dataset | Metric | Value | Model
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 86.2 | S-TR (2-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84.8 | S-TR (2-stream)
Video | NTU RGB+D 120 | GFLOPS per prediction | 32.4 | S-TR (2-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 86.1 | CoS-TR* (2-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84.8 | CoS-TR* (2-stream)
Video | NTU RGB+D 120 | GFLOPS per prediction | 0.3 | CoS-TR* (2-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 85.5 | CoST-GCN* (2-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84 | CoST-GCN* (2-stream)
Video | NTU RGB+D 120 | GFLOPS per prediction | 0.32 | CoST-GCN* (2-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 85.4 | AGCN (2-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84 | AGCN (2-stream)
Video | NTU RGB+D 120 | GFLOPS per prediction | 37.38 | AGCN (2-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 85.1 | ST-GCN (2-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 83.7 | ST-GCN (2-stream)
Video | NTU RGB+D 120 | GFLOPS per prediction | 33.46 | ST-GCN (2-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 82 | CoAGCN* (2-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 80.4 | CoAGCN* (2-stream)
Video | NTU RGB+D 120 | GFLOPS per prediction | 0.44 | CoAGCN* (2-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 81.8 | S-TR (1-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 80.2 | S-TR (1-stream)
Video | NTU RGB+D 120 | GFLOPS per prediction | 16.2 | S-TR (1-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 81.7 | CoS-TR* (1-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79.7 | CoS-TR* (1-stream)
Video | NTU RGB+D 120 | GFLOPS per prediction | 0.15 | CoS-TR* (1-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 80.7 | AGCN (1-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79.7 | AGCN (1-stream)
Video | NTU RGB+D 120 | GFLOPS per prediction | 18.69 | AGCN (1-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 81.6 | CoST-GCN* (1-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79.4 | CoST-GCN* (1-stream)
Video | NTU RGB+D 120 | GFLOPS per prediction | 0.16 | CoST-GCN* (1-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79 | ST-GCN (1-stream)
Video | NTU RGB+D 120 | GFLOPS per prediction | 16.73 | ST-GCN (1-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Setup) | 79.1 | CoAGCN* (1-stream)
Video | NTU RGB+D 120 | Accuracy (Cross-Subject) | 77.3 | CoAGCN* (1-stream)
Video | NTU RGB+D 120 | GFLOPS per prediction | 0.22 | CoAGCN* (1-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 36.9 | AGCN (2-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 26.91 | AGCN (2-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 35 | AGCN (1-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 13.45 | AGCN (1-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 34.7 | S-TR (2-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 23.24 | S-TR (2-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 34.4 | ST-GCN (2-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 24.09 | ST-GCN (2-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 33.4 | ST-GCN (1-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 12.04 | ST-GCN (1-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 33.1 | CoST-GCN (2-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.32 | CoST-GCN (2-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 33 | CoAGCN (1-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.18 | CoAGCN (1-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 32.7 | CoS-TR (2-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.31 | CoS-TR (2-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 32.2 | CoST-GCN* (2-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.22 | CoST-GCN* (2-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 32 | S-TR (1-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 11.62 | S-TR (1-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 31.8 | CoST-GCN (1-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.16 | CoST-GCN (1-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 30.2 | CoST-GCN* (1-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.11 | CoST-GCN* (1-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 29.9 | CoS-TR* (2-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.22 | CoS-TR* (2-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 29.7 | CoS-TR (1-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 27.5 | CoAGCN* (2-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.25 | CoAGCN* (2-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 27.4 | CoS-TR* (1-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.11 | CoS-TR* (1-stream)
Video | Kinetics-Skeleton dataset | Accuracy | 23.3 | CoAGCN* (1-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.12 | CoAGCN* (1-stream)
Video | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.36 | CoAGCN (2-stream)
Video | NTU RGB+D | Accuracy (CS) | 88.9 | CoS-TR* (2-stream)
Video | NTU RGB+D | Accuracy (CV) | 94.8 | CoS-TR* (2-stream)
Video | NTU RGB+D | GFLOPs per pred | 0.3 | CoS-TR* (2-stream)
Video | NTU RGB+D | Accuracy (CS) | 88.3 | CoST-GCN* (2-stream)
Video | NTU RGB+D | Accuracy (CV) | 95 | CoST-GCN* (2-stream)
Video | NTU RGB+D | GFLOPs per pred | 0.32 | CoST-GCN* (2-stream)
Video | NTU RGB+D | Accuracy (CS) | 86.3 | CoST-GCN*
Video | NTU RGB+D | Accuracy (CV) | 93.8 | CoST-GCN*
Video | NTU RGB+D | GFLOPs per pred | 0.16 | CoST-GCN*
Video | NTU RGB+D | Accuracy (CS) | 86.3 | CoS-TR*
Video | NTU RGB+D | Accuracy (CV) | 92.4 | CoS-TR*
Video | NTU RGB+D | GFLOPs per pred | 0.15 | CoS-TR*
Video | NTU RGB+D | Accuracy (CS) | 86 | ST-GCN
Video | NTU RGB+D | Accuracy (CV) | 93.4 | ST-GCN
Video | NTU RGB+D | GFLOPs per pred | 16.73 | ST-GCN
Video | NTU RGB+D | Accuracy (CS) | 86 | CoAGCN* (2-stream)
Video | NTU RGB+D | Accuracy (CV) | 93.1 | CoAGCN* (2-stream)
Video | NTU RGB+D | GFLOPs per pred | 0.44 | CoAGCN* (2-stream)
Video | NTU RGB+D | Accuracy (CS) | 84.1 | CoAGCN*
Video | NTU RGB+D | Accuracy (CV) | 92.6 | CoAGCN*
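As a worked example of how the per-prediction cost figures relate to each other, the sketch below divides the GFLOPs of a few clip-based baselines by those of their continual counterparts. The pairing of models and the helper name `reduction_factor` are assumptions made for illustration. The inputs are the rounded values from the results listing, so the ratios are approximate.

```python
def reduction_factor(baseline_gflops, continual_gflops):
    """Ratio of per-prediction cost: baseline divided by continual model."""
    return baseline_gflops / continual_gflops


# (dataset, baseline, baseline GFLOPs, continual model, continual GFLOPs)
# Values copied from the results listing; the pairing is an assumption.
pairs = [
    ("NTU RGB+D 120", "ST-GCN (2-stream)", 33.46, "CoST-GCN* (2-stream)", 0.32),
    ("NTU RGB+D 120", "S-TR (2-stream)", 32.4, "CoS-TR* (2-stream)", 0.3),
    ("NTU RGB+D 120", "AGCN (2-stream)", 37.38, "CoAGCN* (2-stream)", 0.44),
    ("Kinetics-Skeleton", "ST-GCN (1-stream)", 12.04, "CoST-GCN* (1-stream)", 0.11),
]

factors = {
    (dataset, continual): reduction_factor(base_gflops, co_gflops)
    for dataset, _, base_gflops, continual, co_gflops in pairs
}

for (dataset, continual), factor in factors.items():
    print(f"{dataset}: {continual} needs about {factor:.0f}x fewer GFLOPs per prediction")
```

With these inputs the ratios fall roughly between 85x and 109x.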
The results listed above under Video are also listed, with identical values, under the tasks Temporal Action Localization, Zero-Shot Learning, Activity Recognition, and Action Localization.
Action LocalizationNTU RGB+DAccuracy (CS)86CoAGCN* (2-stream)
Action LocalizationNTU RGB+DAccuracy (CV)93.1CoAGCN* (2-stream)
Action LocalizationNTU RGB+DGFLOPs per pred0.44CoAGCN* (2-stream)
Action LocalizationNTU RGB+DAccuracy (CS)84.1CoAGCN*
Action LocalizationNTU RGB+DAccuracy (CV)92.6CoAGCN*
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 86.2 | S-TR (2-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84.8 | S-TR (2-stream)
Action Detection | NTU RGB+D 120 | GFLOPS per prediction | 32.4 | S-TR (2-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 86.1 | CoS-TR* (2-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84.8 | CoS-TR* (2-stream)
Action Detection | NTU RGB+D 120 | GFLOPS per prediction | 0.3 | CoS-TR* (2-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 85.5 | CoST-GCN* (2-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84 | CoST-GCN* (2-stream)
Action Detection | NTU RGB+D 120 | GFLOPS per prediction | 0.32 | CoST-GCN* (2-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 85.4 | AGCN (2-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84 | AGCN (2-stream)
Action Detection | NTU RGB+D 120 | GFLOPS per prediction | 37.38 | AGCN (2-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 85.1 | ST-GCN (2-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 83.7 | ST-GCN (2-stream)
Action Detection | NTU RGB+D 120 | GFLOPS per prediction | 33.46 | ST-GCN (2-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 82 | CoAGCN* (2-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 80.4 | CoAGCN* (2-stream)
Action Detection | NTU RGB+D 120 | GFLOPS per prediction | 0.44 | CoAGCN* (2-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 81.8 | S-TR (1-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 80.2 | S-TR (1-stream)
Action Detection | NTU RGB+D 120 | GFLOPS per prediction | 16.2 | S-TR (1-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 81.7 | CoS-TR* (1-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79.7 | CoS-TR* (1-stream)
Action Detection | NTU RGB+D 120 | GFLOPS per prediction | 0.15 | CoS-TR* (1-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 80.7 | AGCN (1-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79.7 | AGCN (1-stream)
Action Detection | NTU RGB+D 120 | GFLOPS per prediction | 18.69 | AGCN (1-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 81.6 | CoST-GCN* (1-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79.4 | CoST-GCN* (1-stream)
Action Detection | NTU RGB+D 120 | GFLOPS per prediction | 0.16 | CoST-GCN* (1-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79 | ST-GCN (1-stream)
Action Detection | NTU RGB+D 120 | GFLOPS per prediction | 16.73 | ST-GCN (1-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Setup) | 79.1 | CoAGCN* (1-stream)
Action Detection | NTU RGB+D 120 | Accuracy (Cross-Subject) | 77.3 | CoAGCN* (1-stream)
Action Detection | NTU RGB+D 120 | GFLOPS per prediction | 0.22 | CoAGCN* (1-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 36.9 | AGCN (2-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 26.91 | AGCN (2-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 35 | AGCN (1-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 13.45 | AGCN (1-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 34.7 | S-TR (2-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 23.24 | S-TR (2-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 34.4 | ST-GCN (2-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 24.09 | ST-GCN (2-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 33.4 | ST-GCN (1-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 12.04 | ST-GCN (1-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 33.1 | CoST-GCN (2-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.32 | CoST-GCN (2-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 33 | CoAGCN (1-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.18 | CoAGCN (1-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 32.7 | CoS-TR (2-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.31 | CoS-TR (2-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 32.2 | CoST-GCN* (2-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.22 | CoST-GCN* (2-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 32 | S-TR (1-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 11.62 | S-TR (1-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 31.8 | CoST-GCN (1-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.16 | CoST-GCN (1-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 30.2 | CoST-GCN* (1-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.11 | CoST-GCN* (1-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 29.9 | CoS-TR* (2-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.22 | CoS-TR* (2-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 29.7 | CoS-TR (1-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 27.5 | CoAGCN* (2-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.25 | CoAGCN* (2-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 27.4 | CoS-TR* (1-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.11 | CoS-TR* (1-stream)
Action Detection | Kinetics-Skeleton dataset | Accuracy | 23.3 | CoAGCN* (1-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.12 | CoAGCN* (1-stream)
Action Detection | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.36 | CoAGCN (2-stream)
Action Detection | NTU RGB+D | Accuracy (CS) | 88.9 | CoS-TR* (2-stream)
Action Detection | NTU RGB+D | Accuracy (CV) | 94.8 | CoS-TR* (2-stream)
Action Detection | NTU RGB+D | GFLOPs per pred | 0.3 | CoS-TR* (2-stream)
Action Detection | NTU RGB+D | Accuracy (CS) | 88.3 | CoST-GCN* (2-stream)
Action Detection | NTU RGB+D | Accuracy (CV) | 95 | CoST-GCN* (2-stream)
Action Detection | NTU RGB+D | GFLOPs per pred | 0.32 | CoST-GCN* (2-stream)
Action Detection | NTU RGB+D | Accuracy (CS) | 86.3 | CoST-GCN*
Action Detection | NTU RGB+D | Accuracy (CV) | 93.8 | CoST-GCN*
Action Detection | NTU RGB+D | GFLOPs per pred | 0.16 | CoST-GCN*
Action Detection | NTU RGB+D | Accuracy (CS) | 86.3 | CoS-TR*
Action Detection | NTU RGB+D | Accuracy (CV) | 92.4 | CoS-TR*
Action Detection | NTU RGB+D | GFLOPs per pred | 0.15 | CoS-TR*
Action Detection | NTU RGB+D | Accuracy (CS) | 86 | ST-GCN
Action Detection | NTU RGB+D | Accuracy (CV) | 93.4 | ST-GCN
Action Detection | NTU RGB+D | GFLOPs per pred | 16.73 | ST-GCN
Action Detection | NTU RGB+D | Accuracy (CS) | 86 | CoAGCN* (2-stream)
Action Detection | NTU RGB+D | Accuracy (CV) | 93.1 | CoAGCN* (2-stream)
Action Detection | NTU RGB+D | GFLOPs per pred | 0.44 | CoAGCN* (2-stream)
Action Detection | NTU RGB+D | Accuracy (CS) | 84.1 | CoAGCN*
Action Detection | NTU RGB+D | Accuracy (CV) | 92.6 | CoAGCN*
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 86.2 | S-TR (2-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84.8 | S-TR (2-stream)
3D Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 32.4 | S-TR (2-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 86.1 | CoS-TR* (2-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84.8 | CoS-TR* (2-stream)
3D Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 0.3 | CoS-TR* (2-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 85.5 | CoST-GCN* (2-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84 | CoST-GCN* (2-stream)
3D Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 0.32 | CoST-GCN* (2-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 85.4 | AGCN (2-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84 | AGCN (2-stream)
3D Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 37.38 | AGCN (2-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 85.1 | ST-GCN (2-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 83.7 | ST-GCN (2-stream)
3D Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 33.46 | ST-GCN (2-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 82 | CoAGCN* (2-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 80.4 | CoAGCN* (2-stream)
3D Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 0.44 | CoAGCN* (2-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 81.8 | S-TR (1-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 80.2 | S-TR (1-stream)
3D Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 16.2 | S-TR (1-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 81.7 | CoS-TR* (1-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79.7 | CoS-TR* (1-stream)
3D Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 0.15 | CoS-TR* (1-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 80.7 | AGCN (1-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79.7 | AGCN (1-stream)
3D Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 18.69 | AGCN (1-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 81.6 | CoST-GCN* (1-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79.4 | CoST-GCN* (1-stream)
3D Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 0.16 | CoST-GCN* (1-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79 | ST-GCN (1-stream)
3D Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 16.73 | ST-GCN (1-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 79.1 | CoAGCN* (1-stream)
3D Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 77.3 | CoAGCN* (1-stream)
3D Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 0.22 | CoAGCN* (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 36.9 | AGCN (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 26.91 | AGCN (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 35 | AGCN (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 13.45 | AGCN (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 34.7 | S-TR (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 23.24 | S-TR (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 34.4 | ST-GCN (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 24.09 | ST-GCN (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 33.4 | ST-GCN (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 12.04 | ST-GCN (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 33.1 | CoST-GCN (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.32 | CoST-GCN (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 33 | CoAGCN (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.18 | CoAGCN (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 32.7 | CoS-TR (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.31 | CoS-TR (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 32.2 | CoST-GCN* (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.22 | CoST-GCN* (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 32 | S-TR (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 11.62 | S-TR (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 31.8 | CoST-GCN (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.16 | CoST-GCN (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 30.2 | CoST-GCN* (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.11 | CoST-GCN* (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 29.9 | CoS-TR* (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.22 | CoS-TR* (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 29.7 | CoS-TR (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 27.5 | CoAGCN* (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.25 | CoAGCN* (2-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 27.4 | CoS-TR* (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.11 | CoS-TR* (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | Accuracy | 23.3 | CoAGCN* (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.12 | CoAGCN* (1-stream)
3D Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.36 | CoAGCN (2-stream)
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 88.9 | CoS-TR* (2-stream)
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 94.8 | CoS-TR* (2-stream)
3D Action Recognition | NTU RGB+D | GFLOPs per pred | 0.3 | CoS-TR* (2-stream)
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 88.3 | CoST-GCN* (2-stream)
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 95 | CoST-GCN* (2-stream)
3D Action Recognition | NTU RGB+D | GFLOPs per pred | 0.32 | CoST-GCN* (2-stream)
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 86.3 | CoST-GCN*
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 93.8 | CoST-GCN*
3D Action Recognition | NTU RGB+D | GFLOPs per pred | 0.16 | CoST-GCN*
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 86.3 | CoS-TR*
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 92.4 | CoS-TR*
3D Action Recognition | NTU RGB+D | GFLOPs per pred | 0.15 | CoS-TR*
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 86 | ST-GCN
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 93.4 | ST-GCN
3D Action Recognition | NTU RGB+D | GFLOPs per pred | 16.73 | ST-GCN
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 86 | CoAGCN* (2-stream)
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 93.1 | CoAGCN* (2-stream)
3D Action Recognition | NTU RGB+D | GFLOPs per pred | 0.44 | CoAGCN* (2-stream)
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 84.1 | CoAGCN*
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 92.6 | CoAGCN*
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 86.2 | S-TR (2-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84.8 | S-TR (2-stream)
Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 32.4 | S-TR (2-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 86.1 | CoS-TR* (2-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84.8 | CoS-TR* (2-stream)
Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 0.3 | CoS-TR* (2-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 85.5 | CoST-GCN* (2-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84 | CoST-GCN* (2-stream)
Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 0.32 | CoST-GCN* (2-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 85.4 | AGCN (2-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 84 | AGCN (2-stream)
Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 37.38 | AGCN (2-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 85.1 | ST-GCN (2-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 83.7 | ST-GCN (2-stream)
Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 33.46 | ST-GCN (2-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 82 | CoAGCN* (2-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 80.4 | CoAGCN* (2-stream)
Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 0.44 | CoAGCN* (2-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 81.8 | S-TR (1-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 80.2 | S-TR (1-stream)
Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 16.2 | S-TR (1-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 81.7 | CoS-TR* (1-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79.7 | CoS-TR* (1-stream)
Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 0.15 | CoS-TR* (1-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 80.7 | AGCN (1-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79.7 | AGCN (1-stream)
Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 18.69 | AGCN (1-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 81.6 | CoST-GCN* (1-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79.4 | CoST-GCN* (1-stream)
Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 0.16 | CoST-GCN* (1-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 79 | ST-GCN (1-stream)
Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 16.73 | ST-GCN (1-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Setup) | 79.1 | CoAGCN* (1-stream)
Action Recognition | NTU RGB+D 120 | Accuracy (Cross-Subject) | 77.3 | CoAGCN* (1-stream)
Action Recognition | NTU RGB+D 120 | GFLOPS per prediction | 0.22 | CoAGCN* (1-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 36.9 | AGCN (2-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 26.91 | AGCN (2-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 35 | AGCN (1-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 13.45 | AGCN (1-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 34.7 | S-TR (2-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 23.24 | S-TR (2-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 34.4 | ST-GCN (2-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 24.09 | ST-GCN (2-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 33.4 | ST-GCN (1-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 12.04 | ST-GCN (1-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 33.1 | CoST-GCN (2-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.32 | CoST-GCN (2-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 33 | CoAGCN (1-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.18 | CoAGCN (1-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 32.7 | CoS-TR (2-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.31 | CoS-TR (2-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 32.2 | CoST-GCN* (2-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.22 | CoST-GCN* (2-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 32 | S-TR (1-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 11.62 | S-TR (1-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 31.8 | CoST-GCN (1-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.16 | CoST-GCN (1-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 30.2 | CoST-GCN* (1-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.11 | CoST-GCN* (1-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 29.9 | CoS-TR* (2-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.22 | CoS-TR* (2-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 29.7 | CoS-TR (1-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 27.5 | CoAGCN* (2-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.25 | CoAGCN* (2-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 27.4 | CoS-TR* (1-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.11 | CoS-TR* (1-stream)
Action Recognition | Kinetics-Skeleton dataset | Accuracy | 23.3 | CoAGCN* (1-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.12 | CoAGCN* (1-stream)
Action Recognition | Kinetics-Skeleton dataset | GFLOPS per prediction | 0.36 | CoAGCN (2-stream)
Action Recognition | NTU RGB+D | Accuracy (CS) | 88.9 | CoS-TR* (2-stream)
Action Recognition | NTU RGB+D | Accuracy (CV) | 94.8 | CoS-TR* (2-stream)
Action Recognition | NTU RGB+D | GFLOPs per pred | 0.3 | CoS-TR* (2-stream)
Action Recognition | NTU RGB+D | Accuracy (CS) | 88.3 | CoST-GCN* (2-stream)
Action Recognition | NTU RGB+D | Accuracy (CV) | 95 | CoST-GCN* (2-stream)
Action Recognition | NTU RGB+D | GFLOPs per pred | 0.32 | CoST-GCN* (2-stream)
Action Recognition | NTU RGB+D | Accuracy (CS) | 86.3 | CoST-GCN*
Action Recognition | NTU RGB+D | Accuracy (CV) | 93.8 | CoST-GCN*
Action Recognition | NTU RGB+D | GFLOPs per pred | 0.16 | CoST-GCN*
Action Recognition | NTU RGB+D | Accuracy (CS) | 86.3 | CoS-TR*
Action Recognition | NTU RGB+D | Accuracy (CV) | 92.4 | CoS-TR*
Action Recognition | NTU RGB+D | GFLOPs per pred | 0.15 | CoS-TR*
Action Recognition | NTU RGB+D | Accuracy (CS) | 86 | ST-GCN
Action Recognition | NTU RGB+D | Accuracy (CV) | 93.4 | ST-GCN
Action Recognition | NTU RGB+D | GFLOPs per pred | 16.73 | ST-GCN
Action Recognition | NTU RGB+D | Accuracy (CS) | 86 | CoAGCN* (2-stream)
Action Recognition | NTU RGB+D | Accuracy (CV) | 93.1 | CoAGCN* (2-stream)
Action Recognition | NTU RGB+D | GFLOPs per pred | 0.44 | CoAGCN* (2-stream)
Action Recognition | NTU RGB+D | Accuracy (CS) | 84.1 | CoAGCN*
Action Recognition | NTU RGB+D | Accuracy (CV) | 92.6 | CoAGCN*
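
The GFLOPS-per-prediction rows can be turned into reduction factors by dividing a baseline's value by that of its continual counterpart. The following is a minimal sketch using the 1-stream Kinetics-Skeleton values listed in the results. The pairing of each baseline with the Co-prefixed model of the same name and the helper name `reduction_factor` are assumptions made for illustration, not something the page defines. Because the listed values are rounded, the ratios may differ slightly from figures reported in the paper.

```python
# GFLOPS per prediction for 1-stream models on the Kinetics-Skeleton dataset,
# copied from the results rows on this page.
GFLOPS = {
    "ST-GCN": 12.04,
    "CoST-GCN": 0.16,
    "CoST-GCN*": 0.11,
    "AGCN": 13.45,
    "CoAGCN": 0.18,
    "CoAGCN*": 0.12,
    "S-TR": 11.62,
    "CoS-TR*": 0.11,
}

# Assumed pairing: each continual (Co-prefixed) model is compared with the
# non-continual model it is named after.
PAIRS = [
    ("ST-GCN", "CoST-GCN"),
    ("ST-GCN", "CoST-GCN*"),
    ("AGCN", "CoAGCN"),
    ("AGCN", "CoAGCN*"),
    ("S-TR", "CoS-TR*"),
]


def reduction_factor(baseline, continual, table=GFLOPS):
    """Return how many times fewer GFLOPS the continual model needs per prediction."""
    return table[baseline] / table[continual]


if __name__ == "__main__":
    for baseline, continual in PAIRS:
        factor = reduction_factor(baseline, continual)
        print(f"{baseline} -> {continual}: {factor:.1f}x fewer GFLOPS per prediction")
```

Only model pairs for which both GFLOPS rows are listed are included; CoS-TR (1-stream) has no GFLOPS entry on this page and is therefore left out.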

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Adapting Vision-Language Models for Evaluating World Models (2025-06-22)