Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

AggMo

General · Introduced 2000 · 1 paper
Source Paper

Description

Aggregated Momentum (AggMo) is a variant of the classical momentum stochastic optimizer that maintains $K$ velocity vectors (momentum buffers), each with its own discount factor $\beta^{(i)}$; together these form the vector $\beta \in \mathbb{R}^{K}$. Rather than committing to a single momentum parameter, AggMo averages the velocity vectors when updating the parameters. The update rule is:

$$\mathbf{v}_{t}^{(i)} = \beta^{(i)} \mathbf{v}_{t-1}^{(i)} - \nabla_{\theta} f\left(\theta_{t-1}\right)$$

$$\theta_{t} = \theta_{t-1} + \frac{\gamma_{t}}{K} \sum_{i=1}^{K} \mathbf{v}_{t}^{(i)}$$

where $\mathbf{v}_{0}^{(i)} = 0$ for each $i$ and $\gamma_{t}$ is the learning rate. The vector $\beta = \left[\beta^{(1)}, \ldots, \beta^{(K)}\right]$ is the damping vector.
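The update rule above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function name `aggmo_step`, its argument layout, and the example values are assumptions made for this sketch, and parameters are represented as flat lists of floats for simplicity.

```python
def aggmo_step(theta, velocities, grad, betas, lr):
    """Apply one AggMo update and return (new_theta, new_velocities).

    theta      -- current parameters, a list of floats
    velocities -- K velocity buffers, each a list the same length as theta
    grad       -- gradient of the loss at theta, a list of floats
    betas      -- K damping coefficients, one per velocity buffer
    lr         -- learning rate (gamma_t in the update rule)

    All names here are hypothetical, chosen only for this sketch.
    """
    K = len(betas)

    # v_t^(i) = beta^(i) * v_{t-1}^(i) - grad, computed for every buffer i
    new_velocities = [
        [beta * v_j - g_j for v_j, g_j in zip(v, grad)]
        for beta, v in zip(betas, velocities)
    ]

    # theta_t = theta_{t-1} + (lr / K) * sum_i v_t^(i)
    new_theta = [
        th + lr / K * sum(v[j] for v in new_velocities)
        for j, th in enumerate(theta)
    ]
    return new_theta, new_velocities


# Usage on f(theta) = 0.5 * theta^2, whose gradient is theta itself.
# The damping values below are illustrative assumptions.
theta = [1.0]
betas = [0.0, 0.5]
velocities = [[0.0] * len(theta) for _ in betas]  # v_0^(i) = 0 for each i
for _ in range(3):
    grad = list(theta)
    theta, velocities = aggmo_step(theta, velocities, grad, betas, lr=0.1)
print(theta)
```

With $K = 1$ the sketch reduces to classical momentum with a single buffer, which is a quick way to sanity-check it against an existing optimizer.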

Papers Using This Method

Aggregated Momentum: Stability Through Passive Damping (2018-04-01)