Papers With Code 2 | ML Benchmarks, SotA Results & Code

Description

The Adaptive Parametric Activation (APA) is defined as $APA(z,λ,κ) = (λ \exp(-κz) + 1)^{-\frac{1}{λ}}$, where $λ$ and $κ$ are learnable parameters. APA generalises the Sigmoid and Gumbel activation functions: setting $λ = 1$ and $κ = 1$ gives the Sigmoid $1/(1 + \exp(-z))$, while the limit $λ \to 0$ with $κ = 1$ gives the Gumbel activation $\exp(-\exp(-z))$. For example, APA can replace the Sigmoid inside a channel attention mechanism, or it can be used in intermediate layers through the Adaptive Generalised Linear Unit (AGLU): $AGLU(z,λ,κ) = z \cdot APA(z,λ,κ)$.
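
The two formulas can be checked numerically with a small scalar sketch. This is not the authors' implementation: the function names `apa` and `aglu` are hypothetical, the code uses plain Python floats rather than learnable tensors, and it assumes $λ > 0$. To avoid overflow for large negative inputs, it evaluates $\log(1 + λ \exp(-κz))$ through a numerically stable softplus rather than directly.

```python
import math


def apa(z: float, lam: float, kappa: float) -> float:
    """Scalar APA: (lam * exp(-kappa * z) + 1) ** (-1 / lam).

    Assumption of this sketch: lam > 0 (lam -> 0 is only approached, not reached).
    """
    if lam <= 0.0:
        raise ValueError("this sketch assumes lam > 0")
    # log(1 + lam * exp(-kappa * z)) == softplus(x) with x = log(lam) - kappa * z
    x = math.log(lam) - kappa * z
    # Stable softplus: max(x, 0) + log1p(exp(-|x|)) never exponentiates a large positive number
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return math.exp(-softplus / lam)


def aglu(z: float, lam: float, kappa: float) -> float:
    """Scalar AGLU: z * APA(z, lam, kappa)."""
    return z * apa(z, lam, kappa)


if __name__ == "__main__":
    z = 0.5
    # lam = 1, kappa = 1 should match the Sigmoid
    print(apa(z, 1.0, 1.0), 1.0 / (1.0 + math.exp(-z)))
    # a very small lam with kappa = 1 should approach the Gumbel activation
    print(apa(z, 1e-8, 1.0), math.exp(-math.exp(-z)))
```

With $λ = 1$ and $κ = 1$, `aglu` reduces to $z$ times the Sigmoid of $z$. In a trained network $λ$ and $κ$ would be parameters updated by the optimiser; how they are initialised or constrained is not specified here, so the sketch leaves that out.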
