Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Demon CM

Category: General · Introduced: 2000 · 1 paper
Source Paper

Description

Demon CM, or SGD with Momentum and Demon, is the Demon momentum decay rule applied to SGD with momentum: the momentum parameter is decayed over the course of training according to the schedule below.

$$\beta_{t} = \beta_{init}\cdot\frac{1-\frac{t}{T}}{\left(1-\beta_{init}\right) + \beta_{init}\left(1-\frac{t}{T}\right)}$$

$$\theta_{t+1} = \theta_{t} - \eta g_{t} + \beta_{t}v_{t}$$

$$v_{t+1} = \beta_{t}v_{t} - \eta g_{t}$$
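The update rules above can be sketched as a single step function. This is a minimal illustration, not a reference implementation; the function name, hyperparameter defaults, and signature are made up for this example.

```python
def demon_cm_step(theta, v, grad, t, T, lr=0.1, beta_init=0.9):
    """One Demon CM update (illustrative sketch).

    Decays the momentum coefficient beta from beta_init at t=0
    down to 0 at t=T, then applies the SGD-with-momentum update.
    """
    frac = 1.0 - t / T
    # beta_t = beta_init * (1 - t/T) / ((1 - beta_init) + beta_init * (1 - t/T))
    beta_t = beta_init * frac / ((1.0 - beta_init) + beta_init * frac)
    # v_{t+1} = beta_t * v_t - eta * g_t
    v_next = beta_t * v - lr * grad
    # theta_{t+1} = theta_t - eta * g_t + beta_t * v_t  (i.e. theta_t + v_{t+1})
    theta_next = theta + v_next
    return theta_next, v_next
```

Note that the parameter update equals `theta + v_next`, since substituting the velocity recursion into the parameter rule gives the same expression as the second equation above.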

Papers Using This Method

Demon: Improved Neural Network Training with Momentum Decay (2019-10-11)