

SiLU

Sigmoid Linear Unit

General · Introduced 2016 · 30 papers
Source Paper: Gaussian Error Linear Units (GELUs)

Description

**Sigmoid Linear Units**, or SiLUs, are activation functions for neural networks. The SiLU activation is the input multiplied by the sigmoid of the input:

$$\text{SiLU}(x) = x\,\sigma(x), \qquad \sigma(x) = \frac{1}{1 + e^{-x}}.$$

See *Gaussian Error Linear Units (GELUs)*, where the SiLU was originally coined, as well as *Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning* and *Swish: a Self-Gated Activation Function*, where the SiLU was studied further.
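For concreteness, here is a minimal NumPy sketch of the function. This is an illustration of the formula above, not code from any particular paper; in practice most frameworks ship SiLU built in (e.g. `torch.nn.SiLU` in PyTorch):

```python
import numpy as np

def silu(x):
    """SiLU(x) = x * sigmoid(x)."""
    x = np.asarray(x, dtype=float)
    # sigmoid(x) = exp(-log(1 + exp(-x))); logaddexp keeps this stable for large |x|.
    return x * np.exp(-np.logaddexp(0.0, -x))

xs = np.array([-6.0, -1.0, 0.0, 1.0, 6.0])
print(silu(xs))  # approx. [-0.0148, -0.2689, 0.0, 0.7311, 5.9852]
```

The sample output shows the SiLU's characteristic shape: it is non-monotonic, dipping below zero for moderately negative inputs before saturating toward zero, while behaving approximately like the identity for large positive inputs.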

Papers Using This Method

- MVNet: Hyperspectral Remote Sensing Image Classification Based on Hybrid Mamba-Transformer Vision Backbone Architecture (2025-07-06)
- Deriving Activation Functions Using Integration (2024-11-20)
- Sparsing Law: Towards Large Language Models with Greater Activation Sparsity (2024-11-04)
- ActNAS: Generating Efficient YOLO Models using Activation NAS (2024-10-11)
- UnSeGArmaNet: Unsupervised Image Segmentation using Graph Neural Networks with Convolutional ARMA Filters (2024-10-08)
- BrainTransformers: SNN-LLM (2024-10-03)
- Efficient Privacy-Preserving KAN Inference Using Homomorphic Encryption (2024-09-12)
- CipherDM: Secure Three-Party Inference for Diffusion Model Sampling (2024-09-09)
- On Expressive Power of Quantized Neural Networks under Fixed-Point Arithmetic (2024-08-30)
- Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation (2024-06-24)
- Expanded Gating Ranges Improve Activation Functions (2024-05-25)
- Stable and Robust Deep Learning By Hyperbolic Tangent Exponential Linear Unit (TeLU) (2024-02-05)
- Leveraging Continuously Differentiable Activation Functions for Learning in Quantized Noisy Environments (2024-02-04)
- ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models (2023-10-06)
- Learnable Extended Activation Function (LEAF) for Deep Neural Networks (2023-09-30)
- Attention-Only Transformers and Implementing MLPs with Attention Heads (2023-09-15)
- Compact: Approximating Complex Activation Functions for Secure Computation (2023-09-09)
- Deep Contract Design via Discontinuous Networks (2023-07-05)
- Demystifying Oversmoothing in Attention-Based Graph Neural Networks (2023-05-25)
- Saturated Non-Monotonic Activation Functions (2023-05-12)