Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


QHAdam

General · Introduced 2018 · 1 paper
Source Paper

Description

The Quasi-Hyperbolic Momentum Algorithm (QHM) is a simple alteration of momentum SGD that averages a plain SGD step with a momentum step. QHAdam is the QH-augmented version of Adam, replacing both of Adam's moment estimators with quasi-hyperbolic terms. When updating the weights, QHAdam decouples the momentum term from the current gradient, and decouples the mean squared gradient term from the current squared gradient.

In essence, the update is a weighted average of the momentum and the plain SGD step, weighting the current gradient with an immediate discount factor $v_{1}$, divided by a weighted average of the mean squared gradients and the current squared gradient, weighting the current squared gradient with an immediate discount factor $v_{2}$:

$$\theta_{t+1, i} = \theta_{t, i} - \eta\left[\frac{\left(1-v_{1}\right)\cdot g_{t} + v_{1}\cdot\hat{m}_{t}}{\sqrt{\left(1-v_{2}\right)g^{2}_{t} + v_{2}\cdot\hat{v}_{t}} + \epsilon}\right], \quad \forall t$$

It is recommended to set $v_{2} = 1$ and $\beta_{2}$ the same as in Adam.
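The update rule above can be sketched as a single NumPy step. This is a minimal illustration, not the authors' reference implementation; the names `qhadam_step`, `nu1`, and `nu2` are assumptions introduced here (`nu1`/`nu2` correspond to $v_{1}$/$v_{2}$ in the equation), and the bias correction of $\hat{m}_t$ and $\hat{v}_t$ follows standard Adam.

```python
import numpy as np

def qhadam_step(theta, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                nu1=0.7, nu2=1.0, eps=1e-8):
    """One QHAdam parameter update (illustrative sketch).

    theta : parameter array
    g     : current gradient
    m, v  : exponential moving averages of g and g**2
    t     : step count, 1-based (used for Adam-style bias correction)
    """
    # Adam-style exponential moving averages of the moments.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2
    # Bias-corrected moment estimates, as in Adam.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Quasi-hyperbolic blends: the current gradient is weighted by
    # (1 - nu1) against the momentum estimate, and the current squared
    # gradient by (1 - nu2) against the second-moment estimate.
    numerator = (1 - nu1) * g + nu1 * m_hat
    denominator = np.sqrt((1 - nu2) * g ** 2 + nu2 * v_hat) + eps
    theta = theta - lr * numerator / denominator
    return theta, m, v
```

With `nu1 = nu2 = 1` the blend reduces to plain Adam, and `nu2 = 1.0` is the default here to match the recommendation above.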

Papers Using This Method

Quasi-hyperbolic momentum and Adam for deep learning (2018-10-16)