Adaptive Input Representations

Natural Language Processing · Introduced 2018 · 66 papers

Description

Adaptive Input Embeddings extend the adaptive softmax to input word representations. The factorization assigns more embedding capacity to frequent words and less to infrequent ones, which cuts the parameter count and reduces overfitting to rare words.
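
As a rough illustration of this factorization, here is a minimal PyTorch sketch (not the reference fairseq implementation). It assumes token ids are sorted by descending frequency, splits the vocabulary into bands at hypothetical cutoffs, gives each band its own embedding table whose width shrinks by a fixed factor, and projects every band back to the shared model dimension. The class name AdaptiveInput, the cutoffs (20000, 60000), and factor=4 are illustrative choices, not values taken from this page.

```python
import torch
import torch.nn as nn


class AdaptiveInput(nn.Module):
    """Minimal sketch of adaptive input embeddings.

    Token ids are assumed sorted by descending frequency, so lower ids
    are more frequent. Each frequency band gets its own embedding table;
    the table width shrinks by `factor` per band, and a bias-free linear
    layer projects every band back to the shared model dimension.
    """

    def __init__(self, vocab_size, d_model, cutoffs=(20000, 60000), factor=4):
        super().__init__()
        self.d_model = d_model
        # Band boundaries over the frequency-sorted vocabulary.
        self.edges = [0, *cutoffs, vocab_size]
        self.bands = nn.ModuleList(
            nn.Sequential(
                nn.Embedding(self.edges[i + 1] - self.edges[i], d_model // factor**i),
                nn.Linear(d_model // factor**i, d_model, bias=False),
            )
            for i in range(len(self.edges) - 1)
        )

    def forward(self, tokens):
        # tokens: LongTensor of token ids, any shape.
        out = torch.zeros(*tokens.shape, self.d_model, device=tokens.device)
        for i, band in enumerate(self.bands):
            lo, hi = self.edges[i], self.edges[i + 1]
            mask = (tokens >= lo) & (tokens < hi)
            if mask.any():
                out[mask] = band(tokens[mask] - lo)  # shift ids into band-local range
        return out


emb = AdaptiveInput(vocab_size=100_000, d_model=512)
ids = torch.randint(0, 100_000, (2, 16))
print(emb(ids).shape)  # torch.Size([2, 16, 512])
```

With these illustrative settings, the three tables hold roughly 20k×512 + 40k×128 + 40k×32 ≈ 16.6M embedding weights instead of the 100k×512 ≈ 51.2M of a flat table, which is where the capacity saving on the long tail of rare words comes from.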

Papers Using This Method

- RLBenchNet: The Right Network for the Right Reinforcement Learning Task (2025-05-21)
- A Combined Encoder and Transformer Approach for Coherent and High-Quality Text Generation (2024-11-19)
- Large Body Language Models (2024-10-21)
- Transformers for Supervised Online Continual Learning (2024-03-03)
- UniMem: Towards a Unified View of Long-Context Large Language Models (2024-02-05)
- Memory-efficient Stochastic methods for Memory-based Transformers (2023-11-14)
- TRAMS: Training-free Memory Selection for Long-range Language Modeling (2023-10-24)
- Approximating Two-Layer Feedforward Networks for Efficient Transformers (2023-10-16)
- Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents (2023-09-29)
- Random-Access Infinite Context Length for Transformers (2023-09-21)
- RCMHA: Relative Convolutional Multi-Head Attention for Natural Language Modelling (2023-08-07)
- Landmark Attention: Random-Access Infinite Context Length for Transformers (2023-05-25)
- Transformer-based World Models Are Happy With 100k Interactions (2023-03-13)
- GTR-CTRL: Instrument and Genre Conditioning for Guitar-Focused Music Generation with Transformers (2023-02-10)
- An Comparative Analysis of Different Pitch and Metrical Grid Encoding Methods in the Task of Sequential Music Generation (2023-01-31)
- Efficient Sparsely Activated Transformers (2022-08-31)
- Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models (2022-08-13)
- Recurrent Memory Transformer (2022-07-14)
- Emotion-Aware Transformer Encoder for Empathetic Dialogue Generation (2022-04-24)
- SinTra: Learning an inspiration model from a single multi-track music segment (2022-04-21)