Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Performer

Natural Language Processing · Introduced 2020 · 103 papers
Source Paper: Rethinking Attention with Performers (Choromanski et al., 2020)

Description

Performer is a Transformer architecture that approximates regular (softmax) full-rank attention with provable accuracy, while using only linear (as opposed to quadratic) space and time complexity and without relying on priors such as sparsity or low-rankness. Performers are linear-attention architectures fully compatible with regular Transformers and come with strong theoretical guarantees: unbiased or nearly unbiased estimation of the attention matrix, uniform convergence, and low estimation variance. To approximate the softmax attention kernel, Performers use Fast Attention Via positive Orthogonal Random features (FAVOR+), which draws on new methods for approximating softmax and Gaussian kernels.
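The core idea can be sketched in a few lines of NumPy. This is a simplified illustration, not the reference implementation: it uses plain Gaussian random projections (the actual FAVOR+ mechanism orthogonalizes them to reduce variance) and omits batching, masking, and multi-head structure. The positive feature map phi(x) = exp(omega·x − ‖x‖²/2)/√m makes phi(q)·phi(k) an unbiased estimate of exp(q·k), so attention can be computed by associating the matrix products the cheap way, in time linear in sequence length:

```python
import numpy as np

def positive_random_features(x, omega):
    """FAVOR+ positive feature map: phi(x) = exp(omega @ x - ||x||^2 / 2) / sqrt(m)."""
    m = omega.shape[0]
    proj = x @ omega.T                                   # (n, m) random projections
    sq_norm = 0.5 * np.sum(x ** 2, axis=-1, keepdims=True)
    return np.exp(proj - sq_norm) / np.sqrt(m)

def performer_attention(Q, K, V, m=256, seed=0):
    """Linear-time approximation of softmax attention (illustrative sketch).

    Orthogonalization of the random matrix omega, used by real FAVOR+ to
    lower estimator variance, is omitted here for brevity.
    """
    d = Q.shape[-1]
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((m, d))                  # m random feature directions
    scale = d ** -0.25                                   # split softmax's 1/sqrt(d) between Q and K
    q = positive_random_features(Q * scale, omega)       # (n, m)
    k = positive_random_features(K * scale, omega)       # (n, m)
    kv = k.T @ V                                         # (m, d_v): O(n*m*d), never materializes n x n
    z = q @ k.sum(axis=0)                                # (n,) row normalizers
    return (q @ kv) / z[:, None]
```

Because `k.T @ V` is computed first, the n-by-n attention matrix is never formed; memory and time grow linearly in n for fixed m and d. Increasing m tightens the approximation to exact softmax attention at linear extra cost.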

Papers Using This Method

- ExpertLongBench: Benchmarking Language Models on Expert-Level Long-Form Generation Tasks with Structured Checklists (2025-06-02)
- WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications (2025-05-20)
- CacheFormer: High Attention-Based Segment Caching (2025-04-18)
- Deconstructing Jazz Piano Style Using Machine Learning (2025-04-07)
- Predicting Survivability of Cancer Patients with Metastatic Patterns Using Explainable AI (2025-04-07)
- Forecasting Empty Container availability for Vehicle Booking System Application (2025-03-14)
- STEAD: Spatio-Temporal Efficient Anomaly Detection for Time and Compute Sensitive Applications (2025-03-11)
- Deep Learning-Based Approach for Automatic 2D and 3D MRI Segmentation of Gliomas (2025-02-27)
- On the use of Performer and Agent Attention for Spoken Language Identification (2025-02-09)
- Nick Patrick Contreras (2025-01-09)
- Comparative Study of Deep Learning Architectures for Textual Damage Level Classification (2025-01-03)
- ELECTRA and GPT-4o: Cost-Effective Partners for Sentiment Analysis (2024-12-29)
- Music Genre Classification: Ensemble Learning with Subcomponents-level Attention (2024-12-20)
- The Two-Hop Curse: LLMs trained on A$\rightarrow$B, B$\rightarrow$C fail to learn A$\rightarrow$C (2024-11-25)
- SOAR: Self-Occluded Avatar Recovery from a Single Video In the Wild (2024-10-31)
- PerTok: Expressive Encoding and Modeling of Symbolic Musical Ideas and Variations (2024-10-02)
- GLMHA: A Guided Low-rank Multi-Head Self-Attention for Efficient Image Restoration and Spectral Reconstruction (2024-10-01)
- Large Language Models versus Classical Machine Learning: Performance in COVID-19 Mortality Prediction Using High-Dimensional Tabular Data (2024-09-02)
- Equitable Skin Disease Prediction Using Transfer Learning and Domain Adaptation (2024-09-01)
- Tangram: Benchmark for Evaluating Geometric Element Recognition in Large Multimodal Models (2024-08-25)