Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Neural Turing Machine

Sequential | Introduced 2014 | 22 papers

Description

A Neural Turing Machine is a neural network model with working memory: it couples a neural network with external memory resources. The whole architecture is differentiable end-to-end, so it can be trained with gradient descent. Trained models can infer simple algorithms such as copying, sorting, and associative recall from input and output examples.

A Neural Turing Machine (NTM) architecture contains two basic components: a neural network controller and a memory bank. The figure presents a high-level diagram of the NTM architecture. Like most neural networks, the controller interacts with the external world via input and output vectors. Unlike a standard network, it also interacts with a memory matrix using selective read and write operations. By analogy to the Turing machine, we refer to the network outputs that parameterise these operations as “heads.”
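The selective read and write operations can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names `read` and `write` are hypothetical, the erase-then-add form of the write is an assumption taken from the original NTM paper (Graves et al., 2014) rather than from the description here, and the per-location weighting `w` is simply given by hand instead of being emitted by a trained controller.

```python
import numpy as np

def read(memory, w):
    """Read vector: sum of the memory rows, each scaled by its weight.

    memory has shape (N, M): N locations, each a vector of width M.
    w has shape (N,): non-negative weights that sum to 1.
    """
    return w @ memory

def write(memory, w, erase, add):
    """Write in two steps, both scaled per location by w.

    erase has shape (M,) with entries in [0, 1]; add has shape (M,).
    """
    memory = memory * (1.0 - np.outer(w, erase))  # erase step
    return memory + np.outer(w, add)              # add step

N, M = 4, 3                          # assumed toy sizes
memory = np.zeros((N, M))
w = np.array([0.0, 1.0, 0.0, 0.0])   # head focused entirely on location 1

memory = write(memory, w, erase=np.ones(M), add=np.array([1.0, 2.0, 3.0]))
r = read(memory, w)
print(r)  # [1. 2. 3.]
```

Because both operations are built from multiplications and additions only, gradients can flow from the read vector back to the weighting and to the vectors that were written.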

Every component of the architecture is differentiable. This is achieved by defining “blurry” read and write operations that interact to a greater or lesser degree with all the elements in memory (rather than addressing a single element, as in a normal Turing machine or digital computer). The degree of blurriness is determined by an attentional “focus” mechanism that constrains each read and write operation to interact with a small portion of the memory, while ignoring the rest. Because interaction with the memory is highly sparse, the NTM is biased towards storing data without interference.

The memory location brought into attentional focus is determined by specialised outputs emitted by the heads. These outputs define a normalised weighting over the rows in the memory matrix (referred to as memory “locations”). Each weighting, one per read or write head, defines the degree to which the head reads or writes at each location. A head can thereby attend sharply to the memory at a single location or weakly to the memory at many locations.
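One way to produce such a normalised weighting is content-based addressing: compare a key vector against every memory row and pass the similarities through a softmax. The sketch below assumes cosine similarity and a sharpness parameter `beta`, following the original NTM paper (Graves et al., 2014); the function name `content_weighting` and the toy values are hypothetical, and the paper's further addressing steps (interpolation with the previous weighting, shifting, sharpening) are left out.

```python
import numpy as np

def content_weighting(memory, key, beta):
    """Normalised weighting over memory rows from similarity to a key.

    memory: (N, M) matrix, key: (M,) vector, beta: positive sharpness.
    A larger beta concentrates the weighting on the best-matching row.
    """
    eps = 1e-8  # guards against division by zero for all-zero rows
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps
    similarity = (memory @ key) / norms   # cosine similarity per row
    scores = beta * similarity
    scores = scores - scores.max()        # stabilise the exponentials
    weights = np.exp(scores)
    return weights / weights.sum()        # softmax: weights sum to 1

memory = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
key = np.array([1.0, 0.0])

sharp = content_weighting(memory, key, beta=50.0)    # nearly one-hot on row 0
diffuse = content_weighting(memory, key, beta=0.1)   # close to uniform
print(sharp.round(3), diffuse.round(3))
```

With a large `beta` the head attends sharply to a single location, and with a small `beta` it attends weakly to many locations, which is the range of behaviour described above.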

Papers Using This Method

Intelligent DoS and DDoS Detection: A Hybrid GRU-NTM Approach to Network Security (2025-04-10)
Memory-augmented conformer for improved end-to-end long-form ASR (2023-09-22)
FashionNTM: Multi-turn Fashion Image Retrieval via Cascaded Memory (2023-08-20)
Token Turing Machines (2022-11-16)
Unsupervised Speaker Adaptation using Attention-based Speaker Memory for End-to-End ASR (2020-02-14)
Memory-Augmented Recurrent Networks for Dialogue Coherence (2019-10-16)
A Neural Turing Machine for Conditional Transition Graph Modeling (2019-07-15)
Understanding Memory Modules on Learning Simple Algorithms (2019-07-01)
A review on Neural Turing Machine (2019-04-10)
Few-Shot Generalization Across Dialogue Tasks (2018-11-28)
Context-Aware Neural Model for Temporal Information Extraction (2018-07-01)
A Taxonomy for Neural Memory Networks (2018-05-01)
Meta-Learning via Feature-Label Memory Network (2017-10-19)
Attention-Set based Metric Learning for Video Face Recognition (2017-04-12)
Tracking the World State with Recurrent Entity Networks (2016-12-12)
Neural Turing Machines: Convergence of Copy Tasks (2016-12-07)
Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes (2016-06-30)
Lie Access Neural Turing Machine (2016-02-28)
Empirical Study on Deep Learning Models for Question Answering (2015-10-26)
A Deep Memory-based Architecture for Sequence-to-Sequence Learning (2015-06-22)