Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Performer

Natural Language Processing · Introduced 2020 · 103 papers
Source Paper: Rethinking Attention with Performers (Choromanski et al., 2020)

Description

Performer is a Transformer architecture that approximates regular (softmax) full-rank attention with provable accuracy, while using only linear (as opposed to quadratic) space and time complexity and without relying on priors such as sparsity or low-rankness. Performers are linear-attention architectures fully compatible with regular Transformers and come with strong theoretical guarantees: unbiased or nearly unbiased estimation of the attention matrix, uniform convergence, and low estimation variance. To approximate the softmax attention kernel, Performers use Fast Attention Via positive Orthogonal Random features (FAVOR+), which draws on new methods for approximating softmax and Gaussian kernels.
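The core idea can be sketched in a few lines of NumPy. This is a simplified illustration, not the reference implementation: it uses plain Gaussian random projections (the actual FAVOR+ mechanism orthogonalizes them to reduce variance) and omits batching, masking, and multi-head structure. The positive feature map phi(x) = exp(omega·x − ‖x‖²/2)/√m makes phi(q)·phi(k) an unbiased estimate of exp(q·k), so attention can be computed by associating the matrix products the cheap way, in time linear in sequence length:

```python
import numpy as np

def positive_random_features(x, omega):
    """FAVOR+ positive feature map: phi(x) = exp(omega @ x - ||x||^2 / 2) / sqrt(m)."""
    m = omega.shape[0]
    proj = x @ omega.T                                   # (n, m) random projections
    sq_norm = 0.5 * np.sum(x ** 2, axis=-1, keepdims=True)
    return np.exp(proj - sq_norm) / np.sqrt(m)

def performer_attention(Q, K, V, m=256, seed=0):
    """Linear-time approximation of softmax attention (illustrative sketch).

    Orthogonalization of the random matrix omega, used by real FAVOR+ to
    lower estimator variance, is omitted here for brevity.
    """
    d = Q.shape[-1]
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((m, d))                  # m random feature directions
    scale = d ** -0.25                                   # split softmax's 1/sqrt(d) between Q and K
    q = positive_random_features(Q * scale, omega)       # (n, m)
    k = positive_random_features(K * scale, omega)       # (n, m)
    kv = k.T @ V                                         # (m, d_v): O(n*m*d), never materializes n x n
    z = q @ k.sum(axis=0)                                # (n,) row normalizers
    return (q @ kv) / z[:, None]
```

Because `k.T @ V` is computed first, the n-by-n attention matrix is never formed; memory and time grow linearly in n for fixed m and d. Increasing m tightens the approximation to exact softmax attention at linear extra cost.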

Papers Using This Method

- ExpertLongBench: Benchmarking Language Models on Expert-Level Long-Form Generation Tasks with Structured Checklists (2025-06-02)
- WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications (2025-05-20)
- CacheFormer: High Attention-Based Segment Caching (2025-04-18)
- Deconstructing Jazz Piano Style Using Machine Learning (2025-04-07)
- Predicting Survivability of Cancer Patients with Metastatic Patterns Using Explainable AI (2025-04-07)
- Forecasting Empty Container availability for Vehicle Booking System Application (2025-03-14)
- STEAD: Spatio-Temporal Efficient Anomaly Detection for Time and Compute Sensitive Applications (2025-03-11)
- Deep Learning-Based Approach for Automatic 2D and 3D MRI Segmentation of Gliomas (2025-02-27)
- On the use of Performer and Agent Attention for Spoken Language Identification (2025-02-09)
- Nick Patrick Contreras (2025-01-09)
- Comparative Study of Deep Learning Architectures for Textual Damage Level Classification (2025-01-03)
- ELECTRA and GPT-4o: Cost-Effective Partners for Sentiment Analysis (2024-12-29)
- Music Genre Classification: Ensemble Learning with Subcomponents-level Attention (2024-12-20)
- The Two-Hop Curse: LLMs trained on A$\rightarrow$B, B$\rightarrow$C fail to learn A$\rightarrow$C (2024-11-25)
- SOAR: Self-Occluded Avatar Recovery from a Single Video In the Wild (2024-10-31)
- PerTok: Expressive Encoding and Modeling of Symbolic Musical Ideas and Variations (2024-10-02)
- GLMHA: A Guided Low-rank Multi-Head Self-Attention for Efficient Image Restoration and Spectral Reconstruction (2024-10-01)
- Large Language Models versus Classical Machine Learning: Performance in COVID-19 Mortality Prediction Using High-Dimensional Tabular Data (2024-09-02)
- Equitable Skin Disease Prediction Using Transfer Learning and Domain Adaptation (2024-09-01)
- Tangram: Benchmark for Evaluating Geometric Element Recognition in Large Multimodal Models (2024-08-25)