PowerSGD

General | Introduced 2019 | 3 papers

Description

PowerSGD is a distributed optimization technique that computes a low-rank approximation of the gradient using a generalized power iteration (also known as subspace iteration). The approximation is computationally lightweight and avoids a prohibitively expensive Singular Value Decomposition. To improve the quality of the approximation at no extra cost, the authors warm-start the power iteration by reusing the approximation from the previous optimization step.
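The warm-started power iteration described above can be illustrated with a minimal single-worker sketch in NumPy. This is an assumption-laden illustration, not the authors' implementation: the function name `power_step`, the synthetic gradient, and the chosen sizes are hypothetical, and the sketch omits the all-reduce across workers and the error feedback that a full distributed setup would include.

```python
import numpy as np

def power_step(grad, q):
    """One subspace-iteration step.

    grad: (n, m) gradient matrix.
    q:    (m, r) right factor carried over from the previous step (warm start).
    Returns (p, new_q) such that grad is approximated by p @ new_q.T.
    """
    p = grad @ q                # project onto the current subspace, shape (n, r)
    p, _ = np.linalg.qr(p)      # orthonormalize the columns of p
    new_q = grad.T @ p          # updated right factor, shape (m, r)
    return p, new_q

rng = np.random.default_rng(0)
n, m, rank = 64, 32, 2

# Hypothetical gradient: a fixed rank-2 signal plus small per-step noise.
base = rng.standard_normal((n, rank)) @ rng.standard_normal((rank, m))

q = rng.standard_normal((m, rank))  # random initialization for the first step
errors = []
for step in range(5):
    grad = base + 0.01 * rng.standard_normal((n, m))
    p, q = power_step(grad, q)      # q is reused on the next step (warm start)
    approx = p @ q.T                # rank-r approximation of grad
    errors.append(np.linalg.norm(grad - approx) / np.linalg.norm(grad))

print("relative errors per step:", errors)
```

Only the two thin factors `p` (n x r) and `q` (m x r) would need to be communicated instead of the full n x m gradient, which is where the compression comes from. Each step costs two matrix products and one QR factorization of a thin matrix, with no SVD.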

Papers Using This Method

Practical Low-Rank Communication Compression in Decentralized Deep Learning (2020-12-01)
PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning (2020-08-04)
PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization (2019-05-31)