Shuai Li, Wanqing Li, Chris Cook, Yanbo Gao
Recurrent neural networks (RNNs) are known to be difficult to train due to the gradient vanishing and exploding problems, which make it hard to learn long-term patterns and to construct deep networks. To address these problems, this paper proposes a new type of RNN, referred to as the independently recurrent neural network (IndRNN), in which the recurrent connection is formulated as a Hadamard product: neurons in the same layer are independent of each other and are connected across layers. Thanks to its better-behaved gradient backpropagation, an IndRNN with regulated recurrent weights effectively addresses the gradient vanishing and exploding problems, so long-term dependencies can be learned. Moreover, an IndRNN can work with non-saturated activation functions such as ReLU (rectified linear unit) and still be trained robustly. Several deeper IndRNN architectures, including the basic stacked IndRNN, the residual IndRNN and the densely connected IndRNN, have been investigated, all of which can be much deeper than existing RNNs. Furthermore, the IndRNN reduces the computation at each time step and can be over 10 times faster than the commonly used long short-term memory (LSTM). Experimental results show that the proposed IndRNN is able to process very long sequences and to construct very deep networks. Better performance has been achieved with IndRNNs on various tasks compared with the traditional RNN, the LSTM and the popular Transformer.
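The abstract's key idea — replacing the full recurrent weight matrix with a Hadamard (element-wise) product so each neuron only depends on its own past state — can be sketched in a few lines of NumPy. This is an illustrative sketch of the recurrence described above, not the authors' implementation; all variable names and sizes here are our own assumptions.

```python
import numpy as np

def indrnn_step(x_t, h_prev, W, u, b):
    """One IndRNN step with a ReLU activation.

    Unlike a vanilla RNN, the recurrence `u * h_prev` is an element-wise
    product with a weight *vector* u, so neurons in the same layer are
    independent of each other; cross-neuron mixing happens only through
    the input weights W of the next layer.
    """
    return np.maximum(0.0, W @ x_t + u * h_prev + b)

# Tiny usage example with hypothetical sizes.
rng = np.random.default_rng(0)
n_in, n_hidden, T = 4, 8, 16
W = rng.standard_normal((n_hidden, n_in)) * 0.1
# Regulating the recurrent weights (e.g. keeping |u| bounded) is what the
# paper relies on to keep gradients well behaved over long sequences.
u = rng.uniform(-1.0, 1.0, size=n_hidden)
b = np.zeros(n_hidden)

h = np.zeros(n_hidden)
for t in range(T):
    h = indrnn_step(rng.standard_normal(n_in), h, W, u, b)
print(h.shape)  # prints (8,)
```

Because the per-step recurrence is an element-wise multiply rather than a matrix-vector product, each step costs O(n) in the hidden size instead of O(n²), which is consistent with the speed advantage over LSTM claimed above.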
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Action Recognition (skeleton-based) | NTU RGB+D | Accuracy (CS) | 86.7 | Dense IndRNN |
| Action Recognition (skeleton-based) | NTU RGB+D | Accuracy (CV) | 93.97 | Dense IndRNN |
| Language Modelling | Penn Treebank (word level) | Test perplexity | 50.97 | Dense IndRNN + dynamic eval |
| Language Modelling | Penn Treebank (word level) | Test perplexity | 56.37 | Dense IndRNN |
| Language Modelling | Penn Treebank (character level) | Bits per character (BPC) | 1.18 | Dense IndRNN |