Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


A Statistical Framework for Low-bitwidth Training of Deep Neural Networks

Jianfei Chen, Yu Gai, Zhewei Yao, Michael W. Mahoney, Joseph E. Gonzalez

Published: 2020-10-27 · NeurIPS 2020
Tasks: Sentiment Analysis · Quantization · Natural Language Inference · Semantic Textual Similarity · Linguistic Acceptability
Links: Paper · PDF · Code (official) · Code

Abstract

Fully quantized training (FQT), which uses low-bitwidth hardware by quantizing the activations, weights, and gradients of a neural network model, is a promising approach to accelerate the training of deep neural networks. One major challenge with FQT is the lack of theoretical understanding, in particular of how gradient quantization impacts convergence properties. In this paper, we address this problem by presenting a statistical framework for analyzing FQT algorithms. We view the quantized gradient of FQT as a stochastic estimator of its full precision counterpart, a procedure known as quantization-aware training (QAT). We show that the FQT gradient is an unbiased estimator of the QAT gradient, and we discuss the impact of gradient quantization on its variance. Inspired by these theoretical results, we develop two novel gradient quantizers, and we show that these have smaller variance than the existing per-tensor quantizer. For training ResNet-50 on ImageNet, our 5-bit block Householder quantizer achieves only 0.5% validation accuracy loss relative to QAT, comparable to the existing INT8 baseline.
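The abstract's key premise — that a stochastically quantized gradient can be an unbiased estimator of its full-precision counterpart — can be illustrated with a minimal per-tensor stochastic-rounding sketch. This is a generic NumPy illustration of the baseline per-tensor quantizer the paper compares against, not the block Householder quantizer; the function name and the 5-bit setting are assumptions for the example.

```python
import numpy as np

def quantize_stochastic(x, bits=5):
    """Per-tensor stochastic-rounding quantizer (generic sketch).

    Maps x onto a uniform grid spanning [x.min(), x.max()] and rounds
    stochastically, so that the dequantized value is an unbiased
    estimator of x: E[quantize_stochastic(x)] == x elementwise.
    """
    levels = 2 ** bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / levels if hi > lo else 1.0
    scaled = (x - lo) / scale                 # real-valued grid coordinates
    floor = np.floor(scaled)
    prob_up = scaled - floor                  # P(round up) = fractional part
    q = floor + (np.random.rand(*x.shape) < prob_up)
    return q * scale + lo                     # dequantized estimate of x

# Unbiasedness check: averaging many independent quantizations recovers x,
# even though each individual copy only takes 2**bits distinct values.
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
est = np.mean([quantize_stochastic(x) for _ in range(2000)], axis=0)
print(np.max(np.abs(est - x)))  # close to 0
```

The variance of this estimator grows with the quantization step `scale`, which is why lower bitwidths (coarser grids) hurt convergence — the motivation for the paper's lower-variance quantizer designs.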

Results

Task | Dataset | Metric | Value | Model
Natural Language Inference | QNLI | Accuracy | 94.5 | PSQ (Chen et al., 2020)
Natural Language Inference | RTE | Accuracy | 86.8 | PSQ (Chen et al., 2020)
Natural Language Inference | MultiNLI | Matched | 89.9 | PSQ (Chen et al., 2020)
Semantic Textual Similarity | MRPC | Accuracy | 90.4 | PSQ (Chen et al., 2020)
Semantic Textual Similarity | STS Benchmark | Pearson Correlation | 0.919 | PSQ (Chen et al., 2020)
Sentiment Analysis | SST-2 Binary classification | Accuracy | 96.2 | PSQ (Chen et al., 2020)
Linguistic Acceptability | CoLA | Accuracy | 67.5 | PSQ (Chen et al., 2020)

Related Papers

- Efficient Deployment of Spiking Neural Networks on SpiNNaker2 for DVS Gesture Recognition Using Neuromorphic Intermediate Representation (2025-09-04)
- An End-to-End DNN Inference Framework for the SpiNNaker2 Neuromorphic MPSoC (2025-07-18)
- AdaptiSent: Context-Aware Adaptive Attention for Multimodal Aspect-Based Sentiment Analysis (2025-07-17)
- Task-Specific Audio Coding for Machines: Machine-Learned Latent Features Are Codes for That Machine (2025-07-17)
- Angle Estimation of a Single Source with Massive Uniform Circular Arrays (2025-07-17)
- SemCSE: Semantic Contrastive Sentence Embeddings Using LLM-Generated Summaries For Scientific Abstracts (2025-07-17)
- AI Wizards at CheckThat! 2025: Enhancing Transformer-Based Embeddings with Sentiment for Subjectivity Detection in News Articles (2025-07-15)
- DCR: Quantifying Data Contamination in LLMs Evaluation (2025-07-15)