
Long Range Arena: A Benchmark for Efficient Transformers

Yi Tay, Mostafa Dehghani, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler

2020-11-08 · Spatial Reasoning · Benchmarking · 16k · ListOps · Long-range modeling

Abstract

Transformers do not scale very well to long sequence lengths, largely because of the quadratic complexity of self-attention. In recent months, a wide spectrum of efficient, fast Transformers have been proposed to tackle this problem, more often than not claiming superior or comparable model quality to vanilla Transformer models. To date, there is no well-established consensus on how to evaluate this class of models. Moreover, inconsistent benchmarking on a wide spectrum of tasks and datasets makes it difficult to assess relative model quality amongst many models. This paper proposes a systematic and unified benchmark, LRA, specifically focused on evaluating model quality under long-context scenarios. Our benchmark is a suite of tasks consisting of sequences ranging from $1K$ to $16K$ tokens, encompassing a wide range of data types and modalities such as text, natural and synthetic images, and mathematical expressions requiring similarity, structural, and visual-spatial reasoning. We systematically evaluate ten well-established long-range Transformer models (Reformers, Linformers, Linear Transformers, Sinkhorn Transformers, Performers, Synthesizers, Sparse Transformers, and Longformers) on our newly proposed benchmark suite. LRA paves the way towards a better understanding of this class of efficient Transformer models, facilitates more research in this direction, and presents new challenging tasks to tackle. Our benchmark code will be released at https://github.com/google-research/long-range-arena.
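To make the quadratic-cost claim concrete, here is a minimal NumPy sketch (not from the paper or its codebase) contrasting vanilla softmax attention, which materialises an n × n weight matrix, with a kernelised rearrangement in the spirit of the Linear Transformer and Performer families evaluated in the paper. The feature map `phi` is an illustrative positive stand-in, not any specific paper's kernel.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Vanilla scaled dot-product self-attention.

    Materialises the full (n x n) weight matrix, which is the source of
    the quadratic cost discussed in the abstract.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                 # (n, n): quadratic in n
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                            # (n, d)

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Kernelised attention in the spirit of Linear Transformers/Performers.

    softmax(Q K^T) V is replaced by phi(Q) (phi(K)^T V), normalised per
    row, so the (n x n) matrix is never formed and cost is linear in n.
    `phi` here is an illustrative feature map, not any paper's exact kernel.
    """
    Qf, Kf = phi(Q), phi(K)
    kv = Kf.T @ V                                 # (d, d): independent of n
    z = Qf @ Kf.sum(axis=0)                       # (n,) row normalisers
    return (Qf @ kv) / z[:, None]                 # (n, d)

# Toy check; LRA sequences range from 1K to 16K tokens.
rng = np.random.default_rng(0)
n, d = 1024, 64
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(softmax_attention(Q, K, V).shape)  # (1024, 64)
print(linear_attention(Q, K, V).shape)   # (1024, 64)
```

The point of the rearrangement is that the expensive n × n product never appears: the (d, d) summary `kv` is computed once and reused for every query row.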

Results

Task: Language Modelling · Dataset: LRA · Values: accuracy on each sub-task

Model           ListOps   Text    Retrieval   Image   Pathfinder   Avg
Transformer      36.37    64.27     57.46     42.44     71.4      54.39
Performer        18.01    65.4      53.82     42.77     77.05     51.41
Sparse Trans.    17.07    63.58     59.59     44.24     71.71     51.24
Linear Trans.    16.13    65.9      53.09     42.34     75.3      50.55
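The Avg column is consistent with the unweighted mean of the five sub-task accuracies, as a quick check against the Transformer row shows (scores copied from the table above):

```python
# Unweighted mean of the five LRA sub-task accuracies (Transformer row).
scores = {"ListOps": 36.37, "Text": 64.27, "Retrieval": 57.46,
          "Image": 42.44, "Pathfinder": 71.4}
avg = sum(scores.values()) / len(scores)
print(round(avg, 2))  # 54.39 -- matches the reported Avg
```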

Related Papers

Visual Place Recognition for Large-Scale UAV Applications (2025-07-20)
Training Transformers with Enforced Lipschitz Constants (2025-07-17)
Disentangling coincident cell events using deep transfer learning and compressive sensing (2025-07-17)
MUPAX: Multidimensional Problem Agnostic eXplainable AI (2025-07-17)
MindJourney: Test-Time Scaling with World Models for Spatial Reasoning (2025-07-16)
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition (2025-07-16)
DCR: Quantifying Data Contamination in LLMs Evaluation (2025-07-15)
A Multi-View High-Resolution Foot-Ankle Complex Point Cloud Dataset During Gait for Occlusion-Robust 3D Completion (2025-07-15)