Adaptive Input Representations

Natural Language Processing · Introduced 2018 · 66 papers

Description

Adaptive Input Embeddings extend the adaptive softmax to input word representations. The factorization assigns more embedding capacity to frequent words and less to infrequent ones, which cuts the parameter count and reduces overfitting to rare words.
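
As a rough illustration of this factorization, here is a minimal PyTorch sketch (not the reference fairseq implementation). It assumes token ids are sorted by descending frequency, splits the vocabulary into bands at hypothetical cutoffs, gives each band its own embedding table whose width shrinks by a fixed factor, and projects every band back to the shared model dimension. The class name AdaptiveInput, the cutoffs (20000, 60000), and factor=4 are illustrative choices, not values taken from this page.

```python
import torch
import torch.nn as nn


class AdaptiveInput(nn.Module):
    """Minimal sketch of adaptive input embeddings.

    Token ids are assumed sorted by descending frequency, so lower ids
    are more frequent. Each frequency band gets its own embedding table;
    the table width shrinks by `factor` per band, and a bias-free linear
    layer projects every band back to the shared model dimension.
    """

    def __init__(self, vocab_size, d_model, cutoffs=(20000, 60000), factor=4):
        super().__init__()
        self.d_model = d_model
        # Band boundaries over the frequency-sorted vocabulary.
        self.edges = [0, *cutoffs, vocab_size]
        self.bands = nn.ModuleList(
            nn.Sequential(
                nn.Embedding(self.edges[i + 1] - self.edges[i], d_model // factor**i),
                nn.Linear(d_model // factor**i, d_model, bias=False),
            )
            for i in range(len(self.edges) - 1)
        )

    def forward(self, tokens):
        # tokens: LongTensor of token ids, any shape.
        out = torch.zeros(*tokens.shape, self.d_model, device=tokens.device)
        for i, band in enumerate(self.bands):
            lo, hi = self.edges[i], self.edges[i + 1]
            mask = (tokens >= lo) & (tokens < hi)
            if mask.any():
                out[mask] = band(tokens[mask] - lo)  # shift ids into band-local range
        return out


emb = AdaptiveInput(vocab_size=100_000, d_model=512)
ids = torch.randint(0, 100_000, (2, 16))
print(emb(ids).shape)  # torch.Size([2, 16, 512])
```

With these illustrative settings, the three tables hold roughly 20k×512 + 40k×128 + 40k×32 ≈ 16.6M embedding weights instead of the 100k×512 ≈ 51.2M of a flat table, which is where the capacity saving on the long tail of rare words comes from.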

Papers Using This Method

- RLBenchNet: The Right Network for the Right Reinforcement Learning Task (2025-05-21)
- A Combined Encoder and Transformer Approach for Coherent and High-Quality Text Generation (2024-11-19)
- Large Body Language Models (2024-10-21)
- Transformers for Supervised Online Continual Learning (2024-03-03)
- UniMem: Towards a Unified View of Long-Context Large Language Models (2024-02-05)
- Memory-efficient Stochastic methods for Memory-based Transformers (2023-11-14)
- TRAMS: Training-free Memory Selection for Long-range Language Modeling (2023-10-24)
- Approximating Two-Layer Feedforward Networks for Efficient Transformers (2023-10-16)
- Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents (2023-09-29)
- Random-Access Infinite Context Length for Transformers (2023-09-21)
- RCMHA: Relative Convolutional Multi-Head Attention for Natural Language Modelling (2023-08-07)
- Landmark Attention: Random-Access Infinite Context Length for Transformers (2023-05-25)
- Transformer-based World Models Are Happy With 100k Interactions (2023-03-13)
- GTR-CTRL: Instrument and Genre Conditioning for Guitar-Focused Music Generation with Transformers (2023-02-10)
- An Comparative Analysis of Different Pitch and Metrical Grid Encoding Methods in the Task of Sequential Music Generation (2023-01-31)
- Efficient Sparsely Activated Transformers (2022-08-31)
- Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models (2022-08-13)
- Recurrent Memory Transformer (2022-07-14)
- Emotion-Aware Transformer Encoder for Empathetic Dialogue Generation (2022-04-24)
- SinTra: Learning an inspiration model from a single multi-track music segment (2022-04-21)