Papers With Code 2 | ML Benchmarks, SotA Results & Code

Description

A Unitary RNN is a recurrent neural network architecture that uses a unitary hidden-to-hidden matrix. Specifically, its dynamics take the form:

$h_{t} = f\left(Wh_{t-1} + Vx_{t}\right)$

where $W$ is a unitary matrix $\left(W^{\dagger}W = I\right)$. The product of unitary matrices is a unitary matrix, so $W$ can be parameterised as a product of simpler unitary matrices:

$h_{t} = f\left(D_{3}R_{2}F^{-1}D_{2}PR_{1}FD_{1}h_{t-1} + Vx_{t}\right)$

where $D_{3}$, $D_{2}$, $D_{1}$ are learned diagonal complex matrices, and $R_{2}$, $R_{1}$ are learned reflection matrices. Matrices $F$ and $F^{-1}$ are the discrete Fourier transform and its inverse. $P$ is a fixed random permutation matrix. The activation function $f\left(h\right)$ applies a rectified linear unit with a learned bias to the modulus of each complex number. Within $W$, only the diagonal and reflection matrices, $D$ and $R$, are learned, so Unitary RNNs have fewer parameters than LSTMs with comparable numbers of hidden units.
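
The closure property used above can be checked in one line. For any two unitary matrices $A$ and $B$ (symbols introduced here only for illustration):

$\left(AB\right)^{\dagger}\left(AB\right) = B^{\dagger}A^{\dagger}AB = B^{\dagger}B = I$

Applying this repeatedly shows that the whole product of factors is unitary, provided each factor is. One consequence is that multiplying by $W$ leaves the norm of the hidden state unchanged:

$\lVert Wh \rVert^{2} = h^{\dagger}W^{\dagger}Wh = h^{\dagger}h = \lVert h \rVert^{2}$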

Source: Associative LSTMs
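
The recurrence described above can be sketched in plain Python as a rough illustration, not a reference implementation. Several details are assumptions that the description does not spell out: each diagonal matrix is taken to have unit-modulus entries $e^{i\theta}$ so that it is unitary, each reflection is taken to be a complex Householder reflection $I - 2vv^{\dagger}/\lVert v \rVert^{2}$, the Fourier transform uses the unitary $1/\sqrt{n}$ scaling, and the activation keeps the phase of each entry while applying a biased ReLU to its modulus. All function and parameter names (`apply_w`, `step`, `init_params`, and so on) are hypothetical.

```python
import cmath
import math
import random


def dft(h, inverse=False):
    """Naive O(n^2) unitary DFT (scaled by 1/sqrt(n)); stands in for F and F^{-1}."""
    n = len(h)
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(h[col] * cmath.exp(sign * 2j * math.pi * row * col / n)
                    for col in range(n))
        for row in range(n)
    ]


def diagonal(theta, h):
    """D h with D = diag(exp(i * theta)); assumed unit-modulus entries keep D unitary."""
    return [cmath.exp(1j * t) * x for t, x in zip(theta, h)]


def reflection(v, h):
    """R h with R = I - 2 v v^dagger / ||v||^2 (assumed Householder form)."""
    norm_sq = sum(abs(c) ** 2 for c in v)
    proj = sum(c.conjugate() * x for c, x in zip(v, h))  # v^dagger h
    return [x - 2.0 * proj / norm_sq * c for c, x in zip(v, h)]


def permute(perm, h):
    """P h for a fixed permutation given as a list of indices."""
    return [h[i] for i in perm]


def mod_relu(z, bias):
    """ReLU with bias applied to the modulus of each entry; the phase is kept."""
    out = []
    for x, b in zip(z, bias):
        r = abs(x)
        out.append((r + b) / r * x if r > 0 and r + b > 0 else 0j)
    return out


def apply_w(p, h):
    """W h = D3 R2 F^{-1} D2 P R1 F D1 h, applied right to left without forming W."""
    h = diagonal(p["d1"], h)
    h = dft(h)
    h = reflection(p["r1"], h)
    h = permute(p["perm"], h)
    h = diagonal(p["d2"], h)
    h = dft(h, inverse=True)
    h = reflection(p["r2"], h)
    return diagonal(p["d3"], h)


def step(p, h_prev, x):
    """One recurrence step h_t = f(W h_{t-1} + V x_t); V is a dense list of rows."""
    wh = apply_w(p, h_prev)
    vx = [sum(v_ij * x_j for v_ij, x_j in zip(row, x)) for row in p["v"]]
    return mod_relu([a + b for a, b in zip(wh, vx)], p["bias"])


def init_params(n_hidden, n_input, seed=0):
    """Random illustrative parameters; the initialisation scheme is an assumption."""
    rng = random.Random(seed)

    def angles():
        return [rng.uniform(-math.pi, math.pi) for _ in range(n_hidden)]

    def cvec(k):
        return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(k)]

    perm = list(range(n_hidden))
    rng.shuffle(perm)  # drawn once, then held constant
    return {
        "d1": angles(), "d2": angles(), "d3": angles(),
        "r1": cvec(n_hidden), "r2": cvec(n_hidden),
        "perm": perm,
        "v": [cvec(n_input) for _ in range(n_hidden)],
        "bias": [0.0] * n_hidden,
    }


def norm(h):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in h))


if __name__ == "__main__":
    params = init_params(n_hidden=8, n_input=3)
    h0 = [complex(i + 1, -i) for i in range(8)]
    # The two norms should agree up to floating-point rounding.
    print(norm(h0), norm(apply_w(params, h0)))
    print(step(params, h0, [1.0, 0.5, -2.0]))
```

Because every factor is applied as a vector operation, $W$ is never stored as a dense matrix. In this sketch the learned parameters of $W$ are three angle vectors and two complex reflection vectors, so their count grows linearly with the number of hidden units. A practical implementation would replace the naive `dft` with a fast Fourier transform.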
