Description
The Beneš block is a computationally efficient alternative to dense attention, enabling the modelling of long-range dependencies in O(n log n) time. By comparison, the dense attention commonly used in Transformers has O(n²) complexity.
In music, dependencies occur on several scales, including a coarse scale that requires processing very long sequences. Beneš blocks have been used in Residual Shuffle-Exchange Networks to achieve state-of-the-art results in music transcription.
Beneš blocks have a ‘receptive field’ spanning the whole sequence, and they have no bottleneck. These properties hold for dense attention but have not been shown for many sparse attention and dilated convolutional architectures.
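The structure behind these claims can be sketched in a few lines. A minimal, illustrative implementation (not the exact architecture from the Residual Shuffle-Exchange paper, whose switch units are learned): a Beneš block consists of log₂(n) shuffle-exchange layers followed by log₂(n) inverse-shuffle layers, each doing O(n) work, giving O(n log n) total. Tracking which inputs each output depends on shows the full-sequence receptive field directly. The function names and the set-based `mix` switch are assumptions for demonstration only.

```python
import math

def shuffle(seq):
    # Perfect (riffle) shuffle: interleave the two halves of the sequence.
    n = len(seq)
    out = [None] * n
    out[0::2] = seq[: n // 2]
    out[1::2] = seq[n // 2 :]
    return out

def inverse_shuffle(seq):
    # Undo the riffle: even positions form the first half, odd the second.
    return seq[0::2] + seq[1::2]

def exchange(seq, switch):
    # Apply a 2-input, 2-output "switch" to each adjacent pair.
    out = []
    for i in range(0, len(seq), 2):
        a, b = switch(seq[i], seq[i + 1])
        out += [a, b]
    return out

def benes_block(seq, switch):
    # Forward half: log2(n) shuffle-exchange layers;
    # backward half: log2(n) exchange + inverse-shuffle layers.
    # 2*log2(n) layers of O(n) switches each -> O(n log n) total work.
    k = int(math.log2(len(seq)))
    for _ in range(k):
        seq = exchange(shuffle(seq), switch)
    for _ in range(k):
        seq = inverse_shuffle(exchange(seq, switch))
    return seq

# Demonstrate the full-sequence receptive field: each element starts as the
# set of input indices it depends on, and a mixing switch unions its inputs.
n = 8
deps = benes_block([{i} for i in range(n)], lambda a, b: (a | b, a | b))
# After the block, every output position depends on every input position.
assert all(d == set(range(n)) for d in deps)
```

With n = 8 the forward half alone (3 layers) already connects every input to every output; the backward half mirrors the wiring so that, with learned switches, the block can realise any permutation, which is the classical property of Beneš networks.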
Papers Using This Method
- Nash Equilibrium Between Consumer Electronic Devices and DoS Attacker for Distributed IoT-enabled RSE Systems (2025-04-13)
- The radius of statistical efficiency (2024-05-15)
- A Multi-Modal Machine Learning Approach to Detect Extreme Rainfall Events in Sicily (2022-12-14)
- Attack-Resilient State Estimation with Intermittent Data Authentication (2020-05-16)
- Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences (2020-04-06)