DRA

Dynamic Range Activator

General

Description

Recursive functions with heteroscedastic, sparse, and high-variance target distributions introduce considerable complexity, making their accurate modeling with neural networks a difficult task. A key property of recursive maps (e.g., the factorial function) is their dramatic growth and decay. Learning this recursive behavior requires not only fitting high-frequency patterns within a bounded region but also successfully extrapolating those patterns beyond that region. Even in time-series prediction tasks, capturing periodic behavior is a challenge. Various methods have been employed to model periodic patterns effectively; however, these approaches typically deal with uni-modal data that also exhibit relatively low variance in both the In-Distribution (ID) and Out-Of-Distribution (OOD) regions, and they do not generalize well to recursive problems with the high variance observed in our context.

Thus, to enable Transformers to capture such behavior and perform proper inference for multi-modal recursive problems, we enhance them by introducing the Dynamic Range Activator (DRA). The DRA is designed to handle the recursive and factorial growth properties inherent in enumerative problems with minimal computational overhead, and it can be integrated into existing neural networks without requiring significant architectural changes. The DRA combines harmonic and hyperbolic components as follows:
\begin{equation} \mathrm{DRA}(x) := x + a \sin^2\left(\frac{x}{b}\right) + c \cos(bx) + d \tanh(bx) , \end{equation}
where $a, b, c, d$ are learnable parameters. This allows the function to simultaneously model periodic data (through the sine and cosine terms) and rapid growth or attenuation (through the hyperbolic tangent).
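A minimal sketch of the DRA formula in NumPy is given below. In a real network, $a, b, c, d$ would be learnable parameters (e.g., `nn.Parameter` scalars in a PyTorch module); here they are plain keyword arguments, and the default initialization of 1.0 is an assumption, not taken from the source.

```python
import numpy as np

def dra(x, a=1.0, b=1.0, c=1.0, d=1.0):
    """Dynamic Range Activator (sketch).

    DRA(x) = x + a * sin^2(x / b) + c * cos(b * x) + d * tanh(b * x)

    a, b, c, d are learnable in the original method; the defaults
    here (all 1.0) are an illustrative assumption.
    """
    return (
        x
        + a * np.sin(x / b) ** 2   # harmonic term for periodic structure
        + c * np.cos(b * x)        # second harmonic term
        + d * np.tanh(b * x)       # hyperbolic term for rapid growth/saturation
    )
```

Note that setting `a = c = d = 0` recovers the identity map, so the residual `x` term lets the activation fall back to a pass-through when the periodic and hyperbolic components are not needed.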

Papers Using This Method