Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


G-GLN Neuron

General · Introduced 2020 · 1 paper
Source Paper

Description

A G-GLN Neuron is the type of neuron used in the G-GLN (Gaussian Gated Linear Network) architecture. The key idea is that further representational power can be added to a weighted product of Gaussians via a contextual gating procedure. This is achieved by extending the weighted product of Gaussians model with an additional type of input called side information. The side information is used by a neuron to select, from a table of weight vectors, the weight vector to apply to a given example. In typical applications to regression, the side information is defined as the (normalized) input features of an example, i.e. $z = (x - \bar{x}) / \sigma_x$.
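The selection mechanism above can be sketched in a few lines. A minimal illustration, assuming halfspace gating (the context-function choice used in the GLN family) with randomly chosen hyperplanes and an arbitrary weight table; the dimensions and values are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

d, k_bits = 4, 3                   # side-info dimension, number of halfspace tests
k = 2 ** k_bits                    # context space size |C| = 8
m = 5                              # number of Gaussian inputs to the neuron
W = rng.normal(size=(k, m)) ** 2   # weight table: one non-negative row per context

# Halfspace context function c: Z -> {0, ..., k-1}.
# Each test asks which side of a random hyperplane z falls on;
# the resulting bits index a row of the weight table.
hyperplanes = rng.normal(size=(k_bits, d))
offsets = rng.normal(size=k_bits)

def context(z):
    bits = (hyperplanes @ z > offsets).astype(int)
    return int(bits @ (2 ** np.arange(k_bits)))  # binary code -> row index

z = rng.normal(size=d)   # (normalized) side information for one example
w = W[context(z)]        # weight vector selected for this example
```

Only the lookup itself is the gating procedure; the table `W` is what gets learned.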

More formally, associated with each neuron is a context function $c: \mathcal{Z} \rightarrow \mathcal{C}$, where $\mathcal{Z}$ is the set of possible side information and $\mathcal{C} = \{0, \ldots, k-1\}$ for some $k \in \mathbb{N}$ is the context space. Each neuron $i$ is parameterized by a weight matrix $W_i = \left[w_{i,0} \ldots w_{i,k-1}\right]^{\top}$ with each row vector $w_{ij} \in \mathcal{W}$ for $0 \leq j < k$. The context function $c$ maps side information $z \in \mathcal{Z}$ to a particular row $w_{i,c(z)}$ of $W_i$, which is then used to weight the product of Gaussians. In other words, a G-GLN neuron can be defined by:

$$\operatorname{PoG}_{W}^{c}\left(y;\, f_{1}(\cdot), \ldots, f_{m}(\cdot),\, z\right) := \operatorname{PoG}_{w_{c(z)}}\left(y;\, f_{1}(\cdot), \ldots, f_{m}(\cdot)\right)$$

with the associated loss function $-\log\left(\operatorname{PoG}_{W}^{c}\left(y;\, f_{1}(y), \ldots, f_{m}(y),\, z\right)\right)$ inheriting all the properties needed to apply Online Convex Programming.
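Once a weight vector is selected, the neuron's output is a weighted product of Gaussians, which is itself Gaussian: raising each density $\mathcal{N}(\mu_i, \sigma_i^2)$ to the power $w_i$ and renormalizing gives precision $\sum_i w_i/\sigma_i^2$ and a precision-weighted mean. A minimal sketch of that closed form (the function name `pog` and the example values are illustrative):

```python
import numpy as np

def pog(mu, sigma2, w):
    """Weighted product of 1-D Gaussians N(mu_i, sigma2_i)^{w_i}, renormalized.

    Returns the mean and variance of the resulting Gaussian.
    """
    prec = np.sum(w / sigma2)             # output precision: sum_i w_i / sigma_i^2
    mean = np.sum(w * mu / sigma2) / prec # precision-weighted mean
    return mean, 1.0 / prec

# Two unit-variance Gaussians at 0 and 2, equal weights 0.5:
mu = np.array([0.0, 2.0])
sigma2 = np.array([1.0, 1.0])
w = np.array([0.5, 0.5])
mean, var = pog(mu, sigma2, w)  # -> (1.0, 1.0)
```

The negative log of this density is convex in the weights $w$, which is what makes Online Convex Programming applicable.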

Papers Using This Method

Gaussian Gated Linear Networks2020-06-10