Papers With Code

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


QHM

General · Introduced 2018 · 2 papers
Source Paper

Description

Quasi-Hyperbolic Momentum (QHM) is a stochastic optimization technique that modifies momentum SGD by replacing its update with a weighted average of a plain SGD step and a momentum step:

$$g_{t+1} = \beta g_{t} + \left(1-\beta\right)\cdot\nabla\hat{L}_{t}\left(\theta_{t}\right)$$

$$\theta_{t+1} = \theta_{t} - \alpha\left[\left(1-v\right)\cdot\nabla\hat{L}_{t}\left(\theta_{t}\right) + v\cdot g_{t+1}\right]$$

The authors suggest a rule of thumb of $v = 0.7$ and $\beta = 0.999$.
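The two update equations above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's reference implementation; the function name `qhm_step` and the toy quadratic objective below are made up for the example.

```python
import numpy as np

def qhm_step(theta, g, grad, lr=0.1, beta=0.999, v=0.7):
    """One QHM update following the two equations above.

    theta: parameter vector, g: momentum buffer, grad: gradient of the
    loss at theta. Returns the updated (theta, g). Defaults use the
    paper's rule-of-thumb v = 0.7, beta = 0.999.
    """
    g = beta * g + (1.0 - beta) * grad                # g_{t+1}
    theta = theta - lr * ((1.0 - v) * grad + v * g)   # theta_{t+1}
    return theta, g

# Toy usage: minimize f(theta) = ||theta||^2 / 2, whose gradient is theta.
theta = np.array([1.0, -2.0])
g = np.zeros_like(theta)
for _ in range(2000):
    theta, g = qhm_step(theta, g, grad=theta)
```

Note that setting v = 0 recovers plain SGD, while v = 1 recovers standard (exponential-moving-average) momentum SGD, which is what makes QHM an interpolation between the two.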

Papers Using This Method

- Understanding the Role of Momentum in Stochastic Gradient Methods (2019-10-30)
- Quasi-hyperbolic momentum and Adam for deep learning (2018-10-16)