Contrasting LMU with LSTM

Anonymous

2022-01-17 · ICLR Blog Track 2022

Abstract

Hidden Markov Models (HMMs) suffer from vanishing transition probabilities, and Recurrent Neural Networks (RNNs) from vanishing and exploding gradients. The LSTM can maintain long-range dependencies on sequence tasks; however, the information flow through the network tends to saturate once the number of time steps exceeds a few thousand. The Legendre Memory Unit (LMU) is a recent evolution of RNN design that handles extremely long-range dependencies gracefully. In this blog post, we try to figure out why the LMU outperforms the LSTM.
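Before diving in, the vanishing-gradient problem mentioned above can be made concrete with a small numerical sketch (an illustrative example of ours, not from the LMU or LSTM papers): backpropagation through a vanilla RNN multiplies one recurrent Jacobian per time step, so if the recurrent weight matrix has spectral norm below 1, the gradient norm decays geometrically with sequence length.

```python
import numpy as np

# Illustrative sketch (hypothetical weights): gradient decay in a vanilla
# RNN. Backprop through T steps multiplies T copies of the recurrent
# Jacobian; with spectral norm 0.9, the norm shrinks roughly as 0.9^T.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))
W *= 0.9 / np.linalg.norm(W, 2)  # rescale so the spectral norm is 0.9

grad = np.eye(8)  # start of backprop: identity Jacobian at the last step
norms = []
for t in range(1000):
    grad = W.T @ grad  # one step of backpropagation through time
    norms.append(np.linalg.norm(grad, 2))

# After 100 steps the gradient norm is already at most 0.9**100 ~ 2.7e-5,
# and after 1000 steps it is numerically negligible.
print(norms[99], norms[-1])
```

This is exactly the regime in which the LSTM's gating was designed to help, and in which the LMU's orthogonal Legendre-polynomial memory is claimed to go further.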