$$s \cdot (\mathrm{ReLU}(x))^2 + b$$
where $s \in \mathbb{R}$ and $b \in \mathbb{R}$ are a scale and a bias shared across all channels; they can be set either as constants ($s = 0.8944$, $b = -0.4472$) or as learnable parameters.
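As a rough illustration of the formula, the following plain-Python sketch evaluates the activation with the constant setting. The function names `scaled_squared_relu` and `apply_to_channels` are hypothetical and chosen only for this example; this is a sketch under those assumptions, not the authors' implementation.

```python
def scaled_squared_relu(x, s=0.8944, b=-0.4472):
    """Return s * ReLU(x)**2 + b for a scalar x.

    The defaults are the constant values quoted in the text; the function
    name is hypothetical and used only for illustration.
    """
    relu = max(x, 0.0)          # ReLU(x)
    return s * relu * relu + b  # scale the squared ReLU, then shift


def apply_to_channels(values, s=0.8944, b=-0.4472):
    """Apply the activation elementwise, with one s and one b for all entries."""
    return [scaled_squared_relu(v, s, b) for v in values]


if __name__ == "__main__":
    # Non-positive inputs all map to b, since ReLU(x) = 0 there.
    print(apply_to_channels([-2.0, 0.0, 0.5, 1.0, 2.0]))
```

Because $s$ and $b$ are single scalars rather than per-channel vectors, the same two values are passed to every element. In the learnable variant, they would presumably be stored as two trainable scalars in whatever framework is used, initialized to the constants above.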