


AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks

Bo Chang, Minmin Chen, Eldad Haber, Ed H. Chi

2019-02-26 | ICLR 2019 | Sequential Image Classification

Abstract

Recurrent neural networks have gained widespread use in modeling sequential data. Learning long-term dependencies using these models remains difficult though, due to exploding or vanishing gradients. In this paper, we draw connections between recurrent networks and ordinary differential equations. A special form of recurrent networks called the AntisymmetricRNN is proposed under this theoretical framework, which is able to capture long-term dependencies thanks to the stability property of its underlying differential equation. Existing approaches to improving RNN trainability often incur significant computation overhead. In comparison, AntisymmetricRNN achieves the same goal by design. We showcase the advantage of this new architecture through extensive simulations and experiments. AntisymmetricRNN exhibits much more predictable dynamics. It outperforms regular LSTM models on tasks requiring long-term memory and matches the performance on tasks where short-term dependencies dominate despite being much simpler.
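The abstract does not spell out the recurrence itself, so the following is only a minimal sketch of the idea, under stated assumptions. It assumes a forward-Euler discretization of an ODE whose recurrent matrix is made antisymmetric by construction (`W - W^T`), with a small diagonal damping term. The function names and the parameters `eps` (step size) and `gamma` (damping) are illustrative choices, not taken from the text above, and the gated variant listed under Results is not covered.

```python
import math


def antisymmetric_step(h, x, W, V, b, eps=0.1, gamma=0.01):
    """One assumed forward-Euler step:
    h_new = h + eps * tanh((W - W^T - gamma * I) h + V x + b).

    h: hidden state (list of n floats), x: input (list of m floats),
    W: n x n weights, V: n x m weights, b: bias (list of n floats).
    """
    n = len(h)
    new_h = []
    for i in range(n):
        # Row i of (W - W^T) applied to h; this matrix is antisymmetric
        # whatever W is. The -gamma * h[i] term is the assumed damping.
        rec = sum((W[i][j] - W[j][i]) * h[j] for j in range(n)) - gamma * h[i]
        inp = sum(V[i][k] * x[k] for k in range(len(x)))
        new_h.append(h[i] + eps * math.tanh(rec + inp + b[i]))
    return new_h


def run_sequence(xs, W, V, b, eps=0.1, gamma=0.01):
    """Fold a sequence of inputs into a final hidden state, starting from zeros."""
    h = [0.0] * len(b)
    for x in xs:
        h = antisymmetric_step(h, x, W, V, b, eps, gamma)
    return h
```

Because `tanh` is bounded by 1 in magnitude, each step in this sketch moves every hidden unit by at most `eps`, which gives one simple view of the "predictable dynamics" the abstract refers to. The stability argument itself rests on the eigenvalues of the antisymmetric matrix and is not reproduced here.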

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Image Classification | noise padded CIFAR-10 | % Test Accuracy | 54.7 | AntisymmetricRNN w/ gating |
| Image Classification | noise padded CIFAR-10 | % Test Accuracy | 11.6 | LSTM |

Related Papers

Minion Gated Recurrent Unit for Continual Learning (2025-03-08)
Learning Long Sequences in Spiking Neural Networks (2023-12-14)
Delayed Memory Unit: Modelling Temporal Dependency Through Delay Gate (2023-10-23)
Traveling Waves Encode the Recent Past and Enhance Sequence Learning (2023-09-03)
Sequence Modeling with Multiresolution Convolutional Memory (2023-05-02)
SMPConv: Self-moving Point Representations for Continuous Convolution (2023-04-05)
Resurrecting Recurrent Neural Networks for Long Sequences (2023-03-11)
VCI-LSTM: Vector Choquet Integral-based Long Short-Term Memory (2022-11-14)