Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Transformer

Natural Language Processing · Introduced 2017 · 14004 papers
Source Paper: Attention Is All You Need (Vaswani et al., 2017)

Description

A Transformer is a model architecture that eschews recurrence and instead relies entirely on an attention mechanism to draw global dependencies between input and output. Before Transformers, the dominant sequence transduction models were based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The Transformer also employs an encoder and decoder, but removing recurrence in favor of attention mechanisms allows for significantly more parallelization than methods like RNNs and CNNs.
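The attention mechanism at the heart of the architecture is scaled dot-product attention: every position attends to every other position in a single matrix multiply, which is what makes the model parallelizable where an RNN is sequential. Below is a minimal NumPy sketch of that computation (an illustration, not code from this page); the function and variable names are ours.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # pairwise query-key similarities
    weights = softmax(scores, axis=-1)  # each row: a distribution over positions
    return weights @ V                  # weighted sum of value vectors

# Toy self-attention: 3 positions, model dimension 4, Q = K = V = x.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (3, 4): one output vector per input position
```

Note that every output row depends on all input rows at once, with no sequential loop over positions; stacking several such heads and mixing them with feed-forward layers gives the encoder and decoder blocks described above.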

Papers Using This Method

DASViT: Differentiable Architecture Search for Vision Transformer (2025-07-17)
Best Practices for Large-Scale, Pixel-Wise Crop Mapping and Transfer Learning Workflows (2025-07-16)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
Langevin Flows for Modeling Neural Latent Dynamics (2025-07-15)
Biological Processing Units: Leveraging an Insect Connectome to Pioneer Biofidelic Neural Architectures (2025-07-15)
KV-Latent: Dimensional-level KV Cache Reduction with Frequency-aware Rotary Positional Embedding (2025-07-15)
Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking (2025-07-15)
Token Compression Meets Compact Vision Transformers: A Survey and Comparative Evaluation for Edge AI (2025-07-13)
Learning from Synthetic Labs: Language Models as Auction Participants (2025-07-12)
Comparative Analysis of Vision Transformers and Traditional Deep Learning Approaches for Automated Pneumonia Detection in Chest X-Rays (2025-07-11)
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving (2025-07-08)
Geo-Registration of Terrestrial LiDAR Point Clouds with Satellite Images without GNSS (2025-07-08)
Tile-Based ViT Inference with Visual-Cluster Priors for Zero-Shot Multi-Species Plant Identification (2025-07-08)
A Wireless Foundation Model for Multi-Task Prediction (2025-07-08)
Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate (2025-07-08)
SV-DRR: High-Fidelity Novel View X-Ray Synthesis Using Diffusion Model (2025-07-07)
Emergent Semantics Beyond Token Embeddings: Transformer LMs with Frozen Visual Unicode Representations (2025-07-07)
Estimating Interventional Distributions with Uncertain Causal Graphs through Meta-Learning (2025-07-07)
AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models (2025-07-07)
Fast and Simplex: 2-Simplicial Attention in Triton (2025-07-03)