Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Adding Attentiveness to the Neurons in Recurrent Neural Networks

Pengfei Zhang, Jianru Xue, Cuiling Lan, Wen-Jun Zeng, Zhanning Gao, Nanning Zheng

2018-07-12 · ECCV 2018
Tasks: Skeleton Based Action Recognition, Action Recognition, Temporal Action Localization
Paper · PDF

Abstract

Recurrent neural networks (RNNs) are capable of modeling the temporal dynamics of complex sequential information. However, the structures of existing RNN neurons mainly focus on controlling the contributions of current and historical information, and do not explore the different importance levels of the different elements within an input vector at a time slot. We propose adding a simple yet effective Element-wise Attention Gate (EleAttG) to an RNN block (e.g., all RNN neurons in a network layer), which empowers the RNN neurons with the capability of attentiveness. For an RNN block, an EleAttG is added to adaptively modulate the input by assigning a different level of importance, i.e., attention, to each element/dimension of the input. We refer to an RNN block equipped with an EleAttG as an EleAtt-RNN block. Specifically, the modulation of the input is content adaptive and is performed at fine granularity, being element-wise rather than input-wise. The proposed EleAttG, as an additional fundamental unit, is general and can be applied to any RNN structure, e.g., standard RNN, Long Short-Term Memory (LSTM), or Gated Recurrent Unit (GRU). We demonstrate the effectiveness of the proposed EleAtt-RNN by applying it to action recognition tasks on both 3D human skeleton data and RGB videos. Experiments show that adding attentiveness through EleAttGs to RNN blocks significantly boosts the power of RNNs.
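
The mechanism described above can be sketched in a few lines. Below is a minimal NumPy illustration of one EleAtt-GRU step: an element-wise attention vector is computed from the current input and the previous hidden state, the input is modulated element-wise by that vector, and a standard GRU update then runs on the modulated input. The specific gate parameterization (a sigmoid of a linear map of `x_t` and `h_prev`) and all parameter names are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def eleatt_gru_step(x_t, h_prev, p):
    """One EleAtt-GRU step (illustrative sketch).

    The gate form -- sigmoid(Wa @ x_t + Ua @ h_prev + ba) -- is an
    assumed parameterization; the key idea from the paper is that the
    attention is element-wise over the input, not a single scalar.
    """
    # EleAttG: one attention weight per input dimension
    a_t = sigmoid(p["Wa"] @ x_t + p["Ua"] @ h_prev + p["ba"])
    x_mod = a_t * x_t  # element-wise modulation of the input

    # Standard GRU update, applied to the modulated input
    z = sigmoid(p["Wz"] @ x_mod + p["Uz"] @ h_prev + p["bz"])
    r = sigmoid(p["Wr"] @ x_mod + p["Ur"] @ h_prev + p["br"])
    h_tilde = np.tanh(p["Wh"] @ x_mod + p["Uh"] @ (r * h_prev) + p["bh"])
    return (1 - z) * h_prev + z * h_tilde

# Toy dimensions: input dim 4, hidden dim 3, random parameters
rng = np.random.default_rng(0)
d_in, d_h = 4, 3
p = {
    "Wa": rng.normal(size=(d_in, d_in)), "Ua": rng.normal(size=(d_in, d_h)), "ba": np.zeros(d_in),
    "Wz": rng.normal(size=(d_h, d_in)), "Uz": rng.normal(size=(d_h, d_h)), "bz": np.zeros(d_h),
    "Wr": rng.normal(size=(d_h, d_in)), "Ur": rng.normal(size=(d_h, d_h)), "br": np.zeros(d_h),
    "Wh": rng.normal(size=(d_h, d_in)), "Uh": rng.normal(size=(d_h, d_h)), "bh": np.zeros(d_h),
}

h = np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):  # a length-5 toy sequence
    h = eleatt_gru_step(x_t, h, p)
print(h.shape)
```

Because the gate output lies in (0, 1) per dimension, the modulation can suppress uninformative input elements (e.g., irrelevant skeleton joints) at each time step while leaving the underlying GRU recurrence unchanged.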

Results

Task | Dataset | Metric | Value | Model
Video | NTU RGB+D | Accuracy (CS) | 79.8 | EleAtt-GRU
Video | NTU RGB+D | Accuracy (CV) | 87.1 | EleAtt-GRU
Temporal Action Localization | NTU RGB+D | Accuracy (CS) | 79.8 | EleAtt-GRU
Temporal Action Localization | NTU RGB+D | Accuracy (CV) | 87.1 | EleAtt-GRU
Zero-Shot Learning | NTU RGB+D | Accuracy (CS) | 79.8 | EleAtt-GRU
Zero-Shot Learning | NTU RGB+D | Accuracy (CV) | 87.1 | EleAtt-GRU
Activity Recognition | NTU RGB+D | Accuracy (CS) | 79.8 | EleAtt-GRU
Activity Recognition | NTU RGB+D | Accuracy (CV) | 87.1 | EleAtt-GRU
Action Localization | NTU RGB+D | Accuracy (CS) | 79.8 | EleAtt-GRU
Action Localization | NTU RGB+D | Accuracy (CV) | 87.1 | EleAtt-GRU
Action Detection | NTU RGB+D | Accuracy (CS) | 79.8 | EleAtt-GRU
Action Detection | NTU RGB+D | Accuracy (CV) | 87.1 | EleAtt-GRU
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 79.8 | EleAtt-GRU
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 87.1 | EleAtt-GRU
Action Recognition | NTU RGB+D | Accuracy (CS) | 79.8 | EleAtt-GRU
Action Recognition | NTU RGB+D | Accuracy (CV) | 87.1 | EleAtt-GRU

Related Papers

A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)
Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition (2025-06-23)
Adapting Vision-Language Models for Evaluating World Models (2025-06-22)