Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


AMSBound

General · Introduced 2019 · 1 paper
Source Paper

Description

AMSBound is a variant of the AMSGrad stochastic optimizer designed to be more robust to extreme learning rates. It applies dynamic bounds to the learning rates: the lower and upper bounds are initialized to zero and infinity, respectively, and both converge smoothly to a constant final step size. AMSBound therefore behaves like an adaptive method at the beginning of training and gradually transforms into SGD (or SGD with momentum) as the time step increases.
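To make the bound behaviour concrete, here is a minimal sketch of one pair of bound functions that matches the description: the lower bound starts at zero, the upper bound starts at infinity, and both approach the same final step size. The functional form and the names `lower_bound`, `upper_bound`, `final_lr`, and `gamma` are assumptions chosen for illustration; the description above does not fix them.

```python
import math


def lower_bound(t, final_lr=0.1, gamma=1e-3):
    """Hypothetical lower bound: 0 at t = 0, rising toward final_lr."""
    return final_lr * (1.0 - 1.0 / (gamma * t + 1.0))


def upper_bound(t, final_lr=0.1, gamma=1e-3):
    """Hypothetical upper bound: infinite at t = 0, falling toward final_lr."""
    if t == 0:
        return math.inf
    return final_lr * (1.0 + 1.0 / (gamma * t))


def clip(value, low, high):
    """Restrict value to the interval [low, high]."""
    return max(low, min(value, high))


# The admissible interval narrows around final_lr as t grows.
for t in (0, 1, 1_000, 1_000_000):
    print(t, lower_bound(t), upper_bound(t))
```

Early in training the interval is so wide that clipping rarely changes the adaptive step size; late in training it is so narrow that every coordinate is forced to nearly the same step size, which is the SGD-like regime.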

$$g_{t} = \nabla f_{t}\left(x_{t}\right)$$

$$m_{t} = \beta_{1t} m_{t-1} + \left(1 - \beta_{1t}\right) g_{t}$$

$$v_{t} = \beta_{2} v_{t-1} + \left(1 - \beta_{2}\right) g_{t}^{2}$$

$$\hat{v}_{t} = \max\left(\hat{v}_{t-1}, v_{t}\right) \text{ and } V_{t} = \text{diag}\left(\hat{v}_{t}\right)$$

$$\eta = \text{Clip}\left(\alpha/\sqrt{V_{t}}, \eta_{l}\left(t\right), \eta_{u}\left(t\right)\right) \text{ and } \eta_{t} = \eta/\sqrt{t}$$

$$x_{t+1} = \Pi_{\mathcal{F}, \text{diag}\left(\eta_{t}^{-1}\right)}\left(x_{t} - \eta_{t} \odot m_{t}\right)$$

where $\alpha$ is the initial step size, and $\eta_{l}$ and $\eta_{u}$ are the lower and upper bound functions, respectively.
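The update equations above can be sketched as a single step function in plain Python. This is an illustrative sketch under several assumptions, not a reference implementation: the feasible set is taken to be all of $\mathbb{R}^d$, so the projection $\Pi$ is the identity; $\beta_{1t}$ is held constant at `beta1`; the bound functions use a hypothetical form controlled by `final_lr` and `gamma`; and the names `init_state` and `amsbound_step` are made up for this example.

```python
import math


def init_state(n):
    """Zero-initialised moment estimates for n parameters."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n, "v_hat": [0.0] * n}


def amsbound_step(x, grad, state, alpha=1e-3, beta1=0.9, beta2=0.999,
                  final_lr=0.1, gamma=1e-3):
    """One unconstrained AMSBound-style update; returns the new parameters."""
    state["t"] += 1
    t = state["t"]
    # Assumed bound functions: eta_l rises from 0, eta_u falls from a very
    # large value, and both approach final_lr as t grows.
    eta_l = final_lr * (1.0 - 1.0 / (gamma * t + 1.0))
    eta_u = final_lr * (1.0 + 1.0 / (gamma * t))
    new_x = []
    for i, (x_i, g_i) in enumerate(zip(x, grad)):
        # First and second moment estimates (m_t and v_t).
        state["m"][i] = beta1 * state["m"][i] + (1.0 - beta1) * g_i
        state["v"][i] = beta2 * state["v"][i] + (1.0 - beta2) * g_i * g_i
        # AMSGrad running maximum (v_hat_t).
        state["v_hat"][i] = max(state["v_hat"][i], state["v"][i])
        denom = math.sqrt(state["v_hat"][i])
        raw = alpha / denom if denom > 0.0 else math.inf
        # Clip the per-coordinate step size, then apply the 1/sqrt(t) decay.
        eta = min(max(raw, eta_l), eta_u)
        eta_t = eta / math.sqrt(t)
        new_x.append(x_i - eta_t * state["m"][i])
    return new_x


# Usage: a few steps on f(x) = sum(x_i^2), whose gradient is 2 * x.
x = [1.0, -2.0]
state = init_state(len(x))
for _ in range(100):
    x = amsbound_step(x, [2.0 * x_i for x_i in x], state)
print(x)
```

Because a zero denominator yields an infinite raw step size that is then clipped to the upper bound, this sketch does not need the small epsilon term that adaptive optimizers commonly add to the denominator.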

Papers Using This Method

Adaptive Gradient Methods with Dynamic Bound of Learning Rate (2019-02-26)