Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

GLN

Gated Linear Network

General · Introduced 2019 · 8 papers
Source Paper: Gated Linear Networks (2019)

Description

A Gated Linear Network, or GLN, is a type of backpropagation-free neural architecture. What distinguishes GLNs from contemporary neural networks is the distributed and local nature of their credit assignment mechanism; each neuron directly predicts the target, forgoing the ability to learn feature representations in favor of rapid online learning. Individual neurons can model nonlinear functions via the use of data-dependent gating in conjunction with online convex optimization.

GLNs are feedforward networks composed of many layers of gated geometric mixing neurons, as shown in the figure. Each neuron in a given layer outputs a gated geometric mixture of the predictions from the previous layer, with the final layer consisting of just a single neuron. In a supervised learning setting, a GLN is trained on (side information, base predictions, label) triplets $(z_t, p_t, x_t)_{t=1,2,3,\ldots}$ derived from input-label pairs $(z_t, x_t)$. There are two types of input to neurons in the network: the first is the side information $z_t$, which can be thought of as the input features; the second is the input to the neuron, which is the predictions output by the previous layer, or, in the case of layer 0, some (optionally) provided base predictions $p_t$ that will typically be a function of $z_t$. Each neuron also takes in a constant bias prediction, which helps empirically and is essential for universality guarantees.
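To make the forward pass concrete, the following is a minimal NumPy sketch of a single gated geometric mixing neuron. The half-space gating scheme, function names, and shapes are illustrative assumptions, not the reference implementation; a geometric mixture of Bernoulli predictions reduces to a sigmoid of a weighted sum of logits, which is what the code exploits.

```python
import numpy as np

def logit(p):
    """Inverse sigmoid: map probabilities in (0, 1) to the real line."""
    return np.log(p) - np.log(1.0 - p)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def halfspace_context(z, hyperplanes):
    """Data-dependent gating (illustrative choice): map side information z
    to a context index by testing which side of each hyperplane z lies on."""
    bits = (hyperplanes @ z > 0).astype(int)
    return int(bits.dot(1 << np.arange(len(bits))))

def gln_neuron_predict(w_table, z, p, hyperplanes):
    """Gated geometric mixture: the context selects a row of active weights,
    which geometrically mix the previous layer's predictions p."""
    u = w_table[halfspace_context(z, hyperplanes)]  # active weights for this z
    return sigmoid(u @ logit(p))
```

Note that the side information $z$ enters only through the gate: it chooses which weight vector is active, while the mixture itself operates purely on the incoming predictions.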

Weights are learnt in a Gated Linear Network using Online Gradient Descent (OGD) locally at each neuron. The key observation is that since each neuron $(i, k)$ in layers $i > 0$ is itself a gated geometric mixture, all of these neurons can be thought of as individually predicting the target. Given side information $z$, each neuron $(i, k)$ suffers a loss, convex in its active weights $u := w_{i k c_{ik}(z)}$, of

$$\ell_t(u) := -\log\left(\operatorname{GEO}_u\left(x_t; p_{i-1}\right)\right)$$
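
Because the geometric mixture is a sigmoid of a weighted sum of logits, this log loss has a delta-rule-like gradient in the active weights: $(\hat{y} - x_t)\,\mathrm{logit}(p_{i-1})$. The sketch below shows one local OGD step under that observation; the learning rate, clipping bound, and function names are illustrative assumptions (the weights are projected onto a bounded hypercube, which keeps the loss convex and bounded over the feasible set).

```python
import numpy as np

def logit(p):
    return np.log(p) - np.log(1.0 - p)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def ogd_step(u, p, x, lr=0.1, w_clip=5.0):
    """One local Online Gradient Descent step on a neuron's active weights u.

    p : predictions from the previous layer (inputs to this neuron)
    x : binary target in {0, 1}
    The loss -log(GEO_u(x; p)) has gradient (y_hat - x) * logit(p).
    """
    y_hat = sigmoid(u @ logit(p))
    grad = (y_hat - x) * logit(p)
    # Project back onto a bounded hypercube of feasible weights.
    return np.clip(u - lr * grad, -w_clip, w_clip)
```

Since only the context-selected weight vector of each neuron is touched on a given example, credit assignment stays entirely local: no gradients are propagated between layers.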

Papers Using This Method

- NoisyGL: A Comprehensive Benchmark for Graph Neural Networks under Label Noise (2024-06-06)
- SemiRetro: Semi-template framework boosts deep retrosynthesis prediction (2022-02-12)
- Zero-Shot Multi-View Indoor Localization via Graph Location Networks (2020-08-06)
- Gait Lateral Network: Learning Discriminative and Compact Representations for Gait Recognition (2020-08-01)
- A Generative Graph Method to Solve the Travelling Salesman Problem (2020-07-09)
- Gaussian Gated Linear Networks (2020-06-10)
- Online Learning in Contextual Bandits using Gated Linear Networks (2020-02-21)
- Gated Linear Networks (2019-09-30)