Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Absolute Position Encodings

General · Introduced 2000 · 13947 papers
Source Paper

Description

Absolute Position Encodings are a type of position embedding for [Transformer-based models] in which positional encodings are added to the input embeddings at the bottoms of the encoder and decoder stacks. The positional encodings have the same dimension $d_{model}$ as the embeddings, so that the two can be summed. In the original implementation, sine and cosine functions of different frequencies are used:

$$\text{PE}\left(pos, 2i\right) = \sin\left(pos/10000^{2i/d_{model}}\right)$$

$$\text{PE}\left(pos, 2i+1\right) = \cos\left(pos/10000^{2i/d_{model}}\right)$$

where $pos$ is the position and $i$ is the dimension. That is, each dimension of the positional encoding corresponds to a sinusoid. The wavelengths form a geometric progression from $2\pi$ to $10000 \cdot 2\pi$. This function was chosen because the authors hypothesized it would allow the model to easily learn to attend by relative positions, since for any fixed offset $k$, $\text{PE}_{pos+k}$ can be represented as a linear function of $\text{PE}_{pos}$.
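The formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, not taken from any particular library; the function name and the choice of `max_len`/`d_model` values are assumptions for the example.

```python
import numpy as np

def positional_encoding(max_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal absolute position encodings: sine in even
    dimensions, cosine in odd dimensions."""
    positions = np.arange(max_len)[:, None]           # shape (max_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]          # even dims, shape (1, d_model/2)
    # pos / 10000^(2i / d_model), broadcast to (max_len, d_model/2)
    angles = positions / (10000.0 ** (dims / d_model))
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angles)                      # PE(pos, 2i)
    pe[:, 1::2] = np.cos(angles)                      # PE(pos, 2i+1)
    return pe

pe = positional_encoding(50, 16)   # one row per position, summed with token embeddings
```

In a Transformer these vectors are simply added elementwise to the token embeddings before the first encoder or decoder layer, which is why matching the embedding dimension $d_{model}$ matters.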

[Figure: sinusoidal position encoding heatmap. Image source: D2L.ai]

Papers Using This Method

- DASViT: Differentiable Architecture Search for Vision Transformer (2025-07-17)
- Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows (2025-07-16)
- DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
- Langevin Flows for Modeling Neural Latent Dynamics (2025-07-15)
- Biological Processing Units: Leveraging an Insect Connectome to Pioneer Biofidelic Neural Architectures (2025-07-15)
- KV-Latent: Dimensional-level KV Cache Reduction with Frequency-aware Rotary Positional Embedding (2025-07-15)
- Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking (2025-07-15)
- Token Compression Meets Compact Vision Transformers: A Survey and Comparative Evaluation for Edge AI (2025-07-13)
- Learning from Synthetic Labs: Language Models as Auction Participants (2025-07-12)
- Comparative Analysis of Vision Transformers and Traditional Deep Learning Approaches for Automated Pneumonia Detection in Chest X-Rays (2025-07-11)
- Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving (2025-07-08)
- Geo-Registration of Terrestrial LiDAR Point Clouds with Satellite Images without GNSS (2025-07-08)
- Tile-Based ViT Inference with Visual-Cluster Priors for Zero-Shot Multi-Species Plant Identification (2025-07-08)
- A Wireless Foundation Model for Multi-Task Prediction (2025-07-08)
- Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate (2025-07-08)
- SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model (2025-07-07)
- Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations (2025-07-07)
- Estimating Interventional Distributions with Uncertain Causal Graphs through Meta-Learning (2025-07-07)
- AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models (2025-07-07)
- Fast and Simplex: 2-Simplicial Attention in Triton (2025-07-03)