Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


mRNN

Multiplicative RNN

Sequential · Introduced 2011 · 1 paper

Description

A Multiplicative RNN (mRNN) is a type of recurrent neural network with multiplicative connections. In a standard RNN, the current input $x_t$ is first transformed via the visible-to-hidden weight matrix $W_{hx}$ and then contributes additively to the input for the current hidden state. An mRNN allows the current input (a character in the original example) to affect the hidden-state dynamics by determining the entire hidden-to-hidden matrix (which defines the non-linear dynamics), in addition to providing an additive bias.

To achieve this, the authors modify the RNN so that its hidden-to-hidden weight matrix is a (learned) function of the current input $x_t$:

$$h_t = \tanh\left(W_{hx} x_t + W_{hh}^{(x_t)} h_{t-1} + b_h\right)$$

$$o_t = W_{oh} h_t + b_o$$

These are the same as the equations for a standard RNN, except that $W_{hh}$ is replaced with $W_{hh}^{(x_t)}$, allowing each input (character) to specify a different hidden-to-hidden weight matrix. In practice, the original formulation factorizes this matrix as $W_{hh}^{(x_t)} = W_{hf}\,\mathrm{diag}(W_{fx} x_t)\,W_{fh}$, so the per-input matrix never has to be stored explicitly.
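A single mRNN step can be sketched as follows. This is a minimal NumPy illustration, not the original implementation; parameter names (`W_hx`, `W_fx`, `W_hf`, `W_fh`, `b_h`) and the factored form $W_{hh}^{(x_t)} = W_{hf}\,\mathrm{diag}(W_{fx} x_t)\,W_{fh}$ follow the description above, with the diagonal applied as an element-wise product:

```python
import numpy as np

def mrnn_step(x_t, h_prev, params):
    """One step of a Multiplicative RNN (minimal sketch).

    The input-dependent hidden-to-hidden matrix is never materialized:
    W_hh^{(x_t)} @ h_prev = W_hf @ ((W_fx @ x_t) * (W_fh @ h_prev)),
    since multiplying by diag(W_fx @ x_t) is an element-wise product.
    """
    W_hx, W_fx, W_hf, W_fh, b_h = params
    f_t = (W_fx @ x_t) * (W_fh @ h_prev)          # factor activations
    h_t = np.tanh(W_hx @ x_t + W_hf @ f_t + b_h)  # new hidden state
    return h_t
```

The factor dimension (rows of `W_fx` and `W_fh`) is a free hyperparameter; it controls the rank of the input-dependent transition and keeps the parameter count linear rather than cubic in the hidden size.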

Papers Using This Method

Data-driven Preference Learning Methods for Sorting Problems with Multiple Temporal Criteria · 2023-09-22