Description
Gradient Sparsification is a technique for distributed training that sparsifies stochastic gradients to reduce communication cost, at the price of only a minor increase in the number of iterations. The key idea is to drop some coordinates of the stochastic gradient and appropriately amplify the remaining coordinates to keep the sparsified stochastic gradient unbiased. This approach can significantly reduce the coding length of the stochastic gradient while only slightly increasing its variance.
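A minimal sketch of the drop-and-amplify idea described above, assuming a uniform keep probability per coordinate (the original method chooses per-coordinate probabilities, e.g. proportional to gradient magnitudes; the function name and uniform-probability simplification here are illustrative, not the paper's exact algorithm). Each coordinate is kept with probability p and the survivors are rescaled by 1/p, so the sparsified gradient equals the original gradient in expectation:

```python
import numpy as np

np.random.seed(0)  # for a reproducible unbiasedness check below

def sparsify_unbiased(grad, keep_prob):
    """Drop each coordinate independently with probability 1 - keep_prob,
    and rescale survivors by 1 / keep_prob so that
    E[sparsified gradient] = grad (unbiasedness)."""
    mask = np.random.rand(grad.size) < keep_prob
    sparse = np.zeros_like(grad)
    sparse[mask] = grad[mask] / keep_prob
    return sparse

# Unbiasedness check: averaging many sparsified copies recovers the gradient,
# though each individual copy is sparse and has higher variance.
grad = np.array([0.5, -1.2, 3.0, 0.0, 0.7])
estimate = np.mean([sparsify_unbiased(grad, 0.3) for _ in range(20000)], axis=0)
```

With keep probability 0.3, roughly 70% of coordinates are zeroed out in each copy, shrinking the message to communicate; the per-coordinate variance grows by a factor of (1 - p)/p, which is the "slight increase in variance" traded for the shorter coding length.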
Papers Using This Method
- Mobility-Aware Asynchronous Federated Learning with Dynamic Sparsification (2025-06-08)
- Dynamic Gradient Sparsification Training for Few-Shot Fine-tuning of CT Lymph Node Segmentation Foundation Model (2025-03-02)
- Sparse Incremental Aggregation in Satellite Federated Learning (2025-01-20)
- Regularized Top-$k$: A Bayesian Framework for Gradient Sparsification (2025-01-10)
- DQRM: Deep Quantized Recommendation Models (2024-10-26)
- Age-of-Gradient Updates for Federated Learning over Random Access Channels (2024-10-15)
- Novel Gradient Sparsification Algorithm via Bayesian Inference (2024-09-23)
- Preserving Near-Optimal Gradient Sparsification Cost for Scalable Distributed Deep Learning (2024-02-21)
- JointSQ: Joint Sparsification-Quantization for Distributed Learning (2024-01-01)
- RS-DGC: Exploring Neighborhood Statistics for Dynamic Gradient Compression on Remote Sensing Image Interpretation (2023-12-29)
- MiCRO: Near-Zero Cost Gradient Sparsification for Scaling and Accelerating Distributed DNN Training (2023-10-02)
- Gradient Sparsification For Masked Fine-Tuning of Transformers (2023-07-19)
- DEFT: Exploiting Gradient Norm Difference between Model Layers for Scalable Gradient Sparsification (2023-07-07)
- Gradient Sparsification for Efficient Wireless Federated Learning with Differential Privacy (2023-04-09)
- Efficient and Secure Federated Learning for Financial Applications (2023-03-15)
- On the Interaction Between Differential Privacy and Gradient Compression in Deep Learning (2022-11-01)
- Downlink Compression Improves TopK Sparsification (2022-09-30)
- Empirical Analysis on Top-k Gradient Sparsification for Distributed Deep Learning in a Supercomputing Environment (2022-09-18)
- Near-Optimal Sparse Allreduce for Distributed Deep Learning (2022-01-19)
- Sparsified Secure Aggregation for Privacy-Preserving Federated Learning (2021-12-23)