Description
Rotary Position Embedding, or RoPE, is a type of position embedding that encodes absolute positional information with a rotation matrix and naturally incorporates explicit relative position dependency into the self-attention formulation. Notably, RoPE has valuable properties such as the flexibility to extend to any sequence length, decaying inter-token dependency with increasing relative distance, and the capability of equipping linear self-attention with relative position encoding.
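The rotation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: it assumes an even head dimension and the base 10000 used in the RoFormer paper, and the function name `apply_rope` is our own. Each consecutive feature pair (x_{2i}, x_{2i+1}) at position m is rotated by the angle m * theta_i, with theta_i = 10000^(-2i/d).

```python
import numpy as np

def apply_rope(x, base=10000.0):
    """Rotate feature pairs of x by position-dependent angles.

    x: array of shape (seq_len, d), d even; row m holds the
    query/key vector at position m. Returns the rotated array.
    """
    seq_len, d = x.shape
    assert d % 2 == 0, "RoPE pairs up dimensions, so d must be even"
    # Per-pair frequencies theta_i = base^(-2i/d), i = 0..d/2-1.
    theta = base ** (-np.arange(0, d, 2) / d)          # (d/2,)
    angles = np.arange(seq_len)[:, None] * theta       # (seq_len, d/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]                    # split pairs
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin                 # 2D rotation
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Because each pair undergoes a pure rotation, norms are preserved, and the dot product between a rotated query at position m and a rotated key at position n depends only on the offset m - n, which is exactly the relative-position property the description refers to.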
Papers Using This Method
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation (2024-07-05)
Mitigate Position Bias in Large Language Models via Scaling a Single Dimension (2024-06-04)
Llama 2: Open Foundation and Fine-Tuned Chat Models (2023-07-18)
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness (2022-05-27)
PaLM: Scaling Language Modeling with Pathways (2022-04-05)
Hierarchical Transformers Are More Efficient Language Models (2021-10-26)
Conformer-based End-to-end Speech Recognition With Rotary Position Embedding (2021-07-13)
RoFormer: Enhanced Transformer with Rotary Position Embedding (2021-04-20)