Description
A log-time alternative to feedforward layers: inference cost grows logarithmically, rather than linearly, with layer size. It is reported to outperform both vanilla feedforward and mixture-of-experts approaches.
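To make the "log-time" idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a simplified layer in which the neurons are arranged as a balanced binary tree, each visited neuron contributes to the output, and the sign of its pre-activation selects the next child. The names `fff_forward`, `w_in`, and `w_out`, the heap-style node indexing, and the choice of ReLU are all illustrative assumptions.

```python
import numpy as np

def fff_forward(x, w_in, w_out, depth):
    """Hard (inference-time) pass through a binary tree of neurons.

    x:     (d_in,) input vector
    w_in:  (2**depth - 1, d_in) one input weight row per tree node
    w_out: (2**depth - 1, d_out) one output weight row per tree node

    Only `depth` of the 2**depth - 1 neurons are evaluated.
    """
    y = np.zeros(w_out.shape[1])
    node = 0          # start at the root (heap-style indexing, an assumption)
    visited = []
    for _ in range(depth):
        pre = float(w_in[node] @ x)        # pre-activation of the current neuron
        y += max(pre, 0.0) * w_out[node]   # ReLU contribution (assumed activation)
        visited.append(node)
        # Positive pre-activation goes to the right child, otherwise the left.
        node = 2 * node + 1 + int(pre > 0)
    return y, visited

# Usage: a layer with 7 neurons (depth 3) evaluates only 3 of them per input.
depth, d_in, d_out = 3, 4, 2
w_in = np.ones((2**depth - 1, d_in))
w_out = np.ones((2**depth - 1, d_out))
y, path = fff_forward(np.ones(d_in), w_in, w_out, depth)
```

In this sketch the number of neurons evaluated per input equals the tree depth, so doubling the layer width adds only one more neuron evaluation.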
Papers Using This Method
MagicStyle: Portrait Stylization Based on Reference Image (2024-09-12)
The Dial-a-Ride Problem with Limited Pickups per Trip (2024-08-14)
Mathematical Methods for Assessing the Accuracy of Pre-Planned and Guided Surgical Osteotomies (2024-06-04)
Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node (2024-05-27)
Exponentially Faster Language Modelling (2023-11-15)
Fast Feedforward Networks (2023-08-28)