Description
ADAHESSIAN is a second-order stochastic optimization algorithm that directly incorporates approximate curvature information from the loss function. It includes several novel performance-improving features, among them a fast Hutchinson-based method that approximates the curvature matrix with low computational overhead.
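The Hutchinson-based approximation mentioned above can be illustrated with a minimal sketch. It relies on the identity that, for a random vector z with independent ±1 (Rademacher) entries, the expectation of z ⊙ (Hz) equals the diagonal of the Hessian H. Everything below is an illustrative assumption rather than the authors' implementation: the toy quadratic loss, the function names `grad`, `hvp`, and `hutchinson_diag`, and the use of a central finite difference of gradients as a stand-in for a Hessian-vector product obtained by automatic differentiation.

```python
import random


def grad(A, x):
    """Gradient of the toy loss f(x) = 0.5 * x^T A x, i.e. A x for symmetric A."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def hvp(A, x, z, eps=1e-3):
    """Approximate the Hessian-vector product H z without forming H.

    Assumption: a central difference of gradients stands in for the
    autodiff-based product a deep learning framework would provide.
    """
    x_plus = [xi + eps * zi for xi, zi in zip(x, z)]
    x_minus = [xi - eps * zi for xi, zi in zip(x, z)]
    g_plus = grad(A, x_plus)
    g_minus = grad(A, x_minus)
    return [(gp - gm) / (2 * eps) for gp, gm in zip(g_plus, g_minus)]


def hutchinson_diag(A, x, n_samples=1, rng=None):
    """Estimate diag(H) as the sample mean of z * (H z) over Rademacher z."""
    if rng is None:
        rng = random.Random(0)
    n = len(x)
    est = [0.0] * n
    for _ in range(n_samples):
        # Rademacher probe vector: each entry is -1 or +1 with equal probability.
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        hz = hvp(A, x, z)
        for i in range(n):
            est[i] += z[i] * hz[i] / n_samples
    return est


if __name__ == "__main__":
    # Hypothetical symmetric Hessian with true diagonal [2, 3, 4].
    A = [[2.0, 1.0, 0.0],
         [1.0, 3.0, -1.0],
         [0.0, -1.0, 4.0]]
    x = [0.5, -1.0, 2.0]
    print(hutchinson_diag(A, x, n_samples=2000))
```

Each sample costs roughly one Hessian-vector product (two gradient evaluations in this finite-difference variant), which is why the overhead stays low compared with forming the full Hessian. The estimate is noisy for a single sample whenever the Hessian has off-diagonal entries, and it becomes more accurate as samples are averaged.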
Papers Using This Method
A Dynamic Weighting Strategy to Mitigate Worker Node Failure in Distributed Deep Learning (2024-09-14)
SOFIM: Stochastic Optimization Using Regularized Fisher Information Matrix (2024-03-05)
SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms (2023-12-28)
On Scaled Methods for Saddle Point Problems (2022-06-16)
Adaptive Optimizers with Sparse Group Lasso for Neural Networks in CTR Prediction (2021-07-30)
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning (2020-06-01)