Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

EleAtt-RNN: Adding Attentiveness to Neurons in Recurrent Neural Networks

Pengfei Zhang, Jianru Xue, Cuiling Lan, Wen-Jun Zeng, Zhanning Gao, Nanning Zheng

2019-09-03 · Skeleton Based Action Recognition · Gesture Recognition · Action Recognition
Paper · PDF

Abstract

Recurrent neural networks (RNNs) are capable of modeling temporal dependencies in complex sequential data. In general, currently available RNN structures tend to concentrate on controlling the contributions of current and previous information, while the different levels of importance of the individual elements within an input vector are usually ignored. We propose a simple yet effective Element-wise-Attention Gate (EleAttG), which can easily be added to an RNN block (e.g., all RNN neurons in an RNN layer) to give the RNN neurons attentiveness. For an RNN block, an EleAttG adaptively modulates the input by assigning a different level of importance, i.e., attention, to each element/dimension of the input. We refer to an RNN block equipped with an EleAttG as an EleAtt-RNN block. Rather than modulating the input as a whole, the EleAttG modulates it at fine granularity, i.e., element-wise, and the modulation is content adaptive. The proposed EleAttG, as an additional fundamental unit, is general and can be applied to any RNN structure, e.g., the standard RNN, Long Short-Term Memory (LSTM), or Gated Recurrent Unit (GRU). We demonstrate the effectiveness of EleAtt-RNN by applying it to several tasks, including action recognition from both skeleton-based data and RGB videos, gesture recognition, and sequential MNIST classification. Experiments show that adding attentiveness to RNN blocks through EleAttGs significantly improves the power of RNNs.
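The gate described in the abstract can be sketched in a few lines: an attention vector, computed from the current input and the previous hidden state and squashed through a sigmoid, rescales each input dimension before the ordinary GRU update. The NumPy sketch below is a minimal illustration of that idea under assumed shapes, not the authors' implementation; all parameter names (`Wa`, `Ua`, `ba`, etc.) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def eleatt_gru_step(x, h_prev, p):
    """One step of a GRU cell preceded by an Element-wise-Attention Gate.

    The EleAttG computes a per-dimension attention vector from the current
    input and the previous hidden state, then modulates the input
    element-wise before the standard GRU update. Parameter names and
    shapes here are illustrative, not taken from the paper.
    """
    # Element-wise-Attention Gate: sigmoid keeps each weight in (0, 1),
    # so every input dimension is scaled by its own attention value.
    a = sigmoid(p["Wa"] @ x + p["Ua"] @ h_prev + p["ba"])
    x_mod = a * x  # fine-grained, content-adaptive modulation of the input

    # Standard GRU equations, applied to the modulated input
    z = sigmoid(p["Wz"] @ x_mod + p["Uz"] @ h_prev + p["bz"])  # update gate
    r = sigmoid(p["Wr"] @ x_mod + p["Ur"] @ h_prev + p["br"])  # reset gate
    h_tilde = np.tanh(p["Wh"] @ x_mod + p["Uh"] @ (r * h_prev) + p["bh"])
    return (1 - z) * h_prev + z * h_tilde

# Tiny usage example with random parameters.
dx, dh = 4, 3  # input and hidden dimensions
rng = np.random.default_rng(0)
p = {
    # EleAttG parameters map to the input dimension dx
    "Wa": rng.standard_normal((dx, dx)),
    "Ua": rng.standard_normal((dx, dh)),
    "ba": np.zeros(dx),
    # GRU parameters map to the hidden dimension dh
    "Wz": rng.standard_normal((dh, dx)), "Uz": rng.standard_normal((dh, dh)), "bz": np.zeros(dh),
    "Wr": rng.standard_normal((dh, dx)), "Ur": rng.standard_normal((dh, dh)), "br": np.zeros(dh),
    "Wh": rng.standard_normal((dh, dx)), "Uh": rng.standard_normal((dh, dh)), "bh": np.zeros(dh),
}
x = rng.standard_normal(dx)
h = eleatt_gru_step(x, np.zeros(dh), p)  # new hidden state, shape (dh,)
```

Because the gate only rescales the input vector, the same few lines drop in front of a standard RNN or LSTM cell unchanged, which is what makes the unit general.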

Results

Task | Dataset | Metric | Value | Model
Video | NTU RGB+D | Accuracy (CS) | 80.7 | EleAtt-GRU (aug.)
Video | NTU RGB+D | Accuracy (CV) | 88.4 | EleAtt-GRU (aug.)
Temporal Action Localization | NTU RGB+D | Accuracy (CS) | 80.7 | EleAtt-GRU (aug.)
Temporal Action Localization | NTU RGB+D | Accuracy (CV) | 88.4 | EleAtt-GRU (aug.)
Zero-Shot Learning | NTU RGB+D | Accuracy (CS) | 80.7 | EleAtt-GRU (aug.)
Zero-Shot Learning | NTU RGB+D | Accuracy (CV) | 88.4 | EleAtt-GRU (aug.)
Activity Recognition | NTU RGB+D | Accuracy (CS) | 80.7 | EleAtt-GRU (aug.)
Activity Recognition | NTU RGB+D | Accuracy (CV) | 88.4 | EleAtt-GRU (aug.)
Action Localization | NTU RGB+D | Accuracy (CS) | 80.7 | EleAtt-GRU (aug.)
Action Localization | NTU RGB+D | Accuracy (CV) | 88.4 | EleAtt-GRU (aug.)
Action Detection | NTU RGB+D | Accuracy (CS) | 80.7 | EleAtt-GRU (aug.)
Action Detection | NTU RGB+D | Accuracy (CV) | 88.4 | EleAtt-GRU (aug.)
3D Action Recognition | NTU RGB+D | Accuracy (CS) | 80.7 | EleAtt-GRU (aug.)
3D Action Recognition | NTU RGB+D | Accuracy (CV) | 88.4 | EleAtt-GRU (aug.)
Action Recognition | NTU RGB+D | Accuracy (CS) | 80.7 | EleAtt-GRU (aug.)
Action Recognition | NTU RGB+D | Accuracy (CV) | 88.4 | EleAtt-GRU (aug.)

Related Papers

Efficient Deployment of Spiking Neural Networks on SpiNNaker2 for DVS Gesture Recognition Using Neuromorphic Intermediate Representation (2025-09-04)
A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains (2025-07-17)
Visual Hand Gesture Recognition with Deep Learning: A Comprehensive Review of Methods, Datasets, Challenges and Future Research Directions (2025-07-06)
Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment (2025-07-01)
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception (2025-06-26)
How do Foundation Models Compare to Skeleton-Based Approaches for Gesture Recognition in Human-Robot Interaction? (2025-06-25)
Feature Hallucination for Self-supervised Action Recognition (2025-06-25)
CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition (2025-06-25)