Description
The Universal Transformer is a generalization of the Transformer architecture. Instead of a fixed stack of distinct layers, it applies a single shared layer recurrently in depth, combining the parallelizability and global receptive field of feed-forward sequence models like the Transformer with the recurrent inductive bias of RNNs. It also uses a dynamic per-position halting mechanism, based on Adaptive Computation Time (ACT), so that each position can stop refining its representation after a different number of steps.
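The depth-recurrence and per-position halting can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the shared "layer" is a stand-in linear map (a real Universal Transformer uses a full self-attention plus transition block), and the halting probe is an assumed fixed sigmoid over the state rather than a learned unit.

```python
import numpy as np

def universal_transformer_encode(x, step_fn, max_steps=8, threshold=0.99):
    """Apply one shared layer repeatedly with per-position ACT-style halting.

    x       : (seq_len, d_model) input states
    step_fn : shared transition function (same weights at every depth step)
    Returns the halting-weighted average of the per-step states.
    """
    seq_len, _ = x.shape
    halting = np.zeros(seq_len)      # accumulated halting probability per position
    weighted = np.zeros_like(x)      # ACT-weighted output state
    state = x
    for step_idx in range(max_steps):
        still_running = halting < threshold
        if not still_running.any():
            break
        # per-position halting probability (assumed probe, not a learned unit)
        p = 1.0 / (1.0 + np.exp(-state.mean(axis=1)))
        p = np.where(still_running, p, 0.0)
        # positions crossing the threshold (or at the last step) halt,
        # contributing their remaining probability mass as the update weight
        last = step_idx == max_steps - 1
        new_halted = still_running & ((halting + p >= threshold) | last)
        update = np.where(new_halted, 1.0 - halting, p)
        halting = halting + update
        weighted += update[:, None] * state
        state = step_fn(state)       # the same shared layer every step
    return weighted

# toy shared "layer": fixed linear map + nonlinearity
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(16, 16))
step = lambda h: np.tanh(h @ W)

out = universal_transformer_encode(rng.normal(size=(5, 16)), step)
```

Each position accumulates halting probability across steps; once it exceeds the threshold, that position's state stops being updated in the output, while unhalted positions continue through further applications of the shared layer.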
Papers Using This Method
PLUTO: Pathology-Universal Transformer (2024-05-13)
Investigating Recurrent Transformers with Dynamic Halt (2024-02-01)
Self-Critical Alternate Learning based Semantic Broadcast Communication (2023-12-03)
Sparse Universal Transformer (2023-10-11)
UMMAFormer: A Universal Multimodal-adaptive Transformer Framework for Temporal Forgery Localization (2023-08-28)
Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition (2023-03-23)
Semantic Communication with Memory (2023-03-22)
Towards Autoformalization of Mathematics and Code Correctness: Experiments with Elementary Proofs (2023-01-05)
Universal Transformer Hawkes Process with Adaptive Recursive Iteration (2021-12-29)
The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers (2021-08-26)
Using BERT Encoding and Sentence-Level Language Model for Sentence Ordering (2021-08-24)
Semantic Communication with Adaptive Universal Transformer (2021-08-20)
Automatically Ranked Russian Paraphrase Corpus for Text Generation (2020-06-17)
Universal Transforming Geometric Network (2019-08-02)
Latent Universal Task-Specific BERT (2019-05-16)
Self-Attentive Model for Headline Generation (2019-01-23)
Attending to Mathematical Language with Transformers (2018-12-05)
Universal Transformers (2018-07-10)