Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


QHAdam

General · Introduced 2018 · 1 paper
Source Paper

Description

The Quasi-Hyperbolic Momentum Algorithm (QHM) is a simple alteration of momentum SGD that averages a plain SGD step with a momentum step. QHAdam is the QH-augmented version of Adam, replacing both of Adam's moment estimators with quasi-hyperbolic terms. When updating the weights, QHAdam decouples the momentum term from the current gradient, and decouples the mean squared gradient term from the current squared gradient.

In essence, the update is a weighted average of the momentum and the plain SGD step, weighting the current gradient with an immediate discount factor $v_{1}$, divided by a weighted average of the mean squared gradients and the current squared gradient, weighting the current squared gradient with an immediate discount factor $v_{2}$:

$$\theta_{t+1, i} = \theta_{t, i} - \eta\left[\frac{\left(1-v_{1}\right)\cdot g_{t} + v_{1}\cdot\hat{m}_{t}}{\sqrt{\left(1-v_{2}\right)g^{2}_{t} + v_{2}\cdot\hat{v}_{t}} + \epsilon}\right], \quad \forall t$$

It is recommended to set $v_{2} = 1$ and $\beta_{2}$ the same as in Adam.
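The update rule above can be sketched as a single NumPy step. This is a minimal illustration, not the authors' reference implementation; the names `qhadam_step`, `nu1`, and `nu2` are assumptions introduced here (`nu1`/`nu2` correspond to $v_{1}$/$v_{2}$ in the equation), and the bias correction of $\hat{m}_t$ and $\hat{v}_t$ follows standard Adam.

```python
import numpy as np

def qhadam_step(theta, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                nu1=0.7, nu2=1.0, eps=1e-8):
    """One QHAdam parameter update (illustrative sketch).

    theta : parameter array
    g     : current gradient
    m, v  : exponential moving averages of g and g**2
    t     : step count, 1-based (used for Adam-style bias correction)
    """
    # Adam-style exponential moving averages of the moments.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2
    # Bias-corrected moment estimates, as in Adam.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Quasi-hyperbolic blends: the current gradient is weighted by
    # (1 - nu1) against the momentum estimate, and the current squared
    # gradient by (1 - nu2) against the second-moment estimate.
    numerator = (1 - nu1) * g + nu1 * m_hat
    denominator = np.sqrt((1 - nu2) * g ** 2 + nu2 * v_hat) + eps
    theta = theta - lr * numerator / denominator
    return theta, m, v
```

With `nu1 = nu2 = 1` the blend reduces to plain Adam, and `nu2 = 1.0` is the default here to match the recommendation above.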

Papers Using This Method

Quasi-hyperbolic momentum and Adam for deep learning (2018-10-16)