Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


HMQ: Hardware Friendly Mixed Precision Quantization Block for CNNs

Hai Victor Habi, Roy H. Jennings, Arnon Netzer

2020-07-20 · ECCV 2020 · Quantization
Paper · PDF · Code (official) · Code

Abstract

Recent work in network quantization produced state-of-the-art results using mixed precision quantization. An imperative requirement for many efficient edge device hardware implementations is that their quantizers are uniform and with power-of-two thresholds. In this work, we introduce the Hardware Friendly Mixed Precision Quantization Block (HMQ) in order to meet this requirement. The HMQ is a mixed precision quantization block that repurposes the Gumbel-Softmax estimator into a smooth estimator of a pair of quantization parameters, namely, bit-width and threshold. HMQs use this to search over a finite space of quantization schemes. Empirically, we apply HMQs to quantize classification models trained on CIFAR10 and ImageNet. For ImageNet, we quantize four different architectures and show that, in spite of the added restrictions to our quantization scheme, we achieve competitive and, in some cases, state-of-the-art results.
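The abstract describes two ingredients: a uniform quantizer whose threshold is a power of two, and a Gumbel-Softmax distribution used as a smooth estimator over a finite grid of (bit-width, threshold) candidates. The sketch below illustrates that idea in NumPy; it is not the authors' implementation, and the candidate grid, temperature, and uniform logits are illustrative assumptions.

```python
import numpy as np

def uniform_quantize(x, threshold, bits):
    """Uniform symmetric quantizer with a power-of-two threshold.

    Maps x onto 2**bits evenly spaced levels in [-threshold, threshold).
    """
    step = (2.0 * threshold) / (2 ** bits)           # quantization step size
    x_clipped = np.clip(x, -threshold, threshold - step)
    return np.round(x_clipped / step) * step

def gumbel_softmax_weights(logits, temperature, rng):
    """Soft (differentiable) one-hot sample over candidate quantizers."""
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature
    e = np.exp(y - y.max())                          # stable softmax
    return e / e.sum()

# Finite search space: power-of-two thresholds x candidate bit-widths
# (the grid here is a toy example, not the paper's configuration).
thresholds = [2.0 ** k for k in (-1, 0, 1)]          # 0.5, 1.0, 2.0
bitwidths = [2, 4, 8]
candidates = [(t, b) for t in thresholds for b in bitwidths]

rng = np.random.default_rng(0)
logits = np.zeros(len(candidates))                   # learnable in the real method
w = gumbel_softmax_weights(logits, temperature=0.5, rng=rng)

x = rng.normal(size=16)
# Smooth mixture of candidate quantizers: during training, gradients flow
# through w back to the logits, so bit-width and threshold are searched
# jointly; at inference a single candidate would be selected.
x_q = sum(wi * uniform_quantize(x, t, b) for wi, (t, b) in zip(w, candidates))
```

As the temperature is annealed toward zero, the weights `w` approach a hard one-hot choice, which is how a single hardware-friendly quantization scheme is ultimately selected.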

Results

Task          Dataset   Metric                Value   Model
Quantization  ImageNet  Activation bits       8       EfficientNet-B0-W8A8
Quantization  ImageNet  Top-1 Accuracy (%)    76.4    EfficientNet-B0-W8A8
Quantization  ImageNet  Weight bits           8       EfficientNet-B0-W8A8
Quantization  ImageNet  Activation bits       4       EfficientNet-B0-W4A4
Quantization  ImageNet  Top-1 Accuracy (%)    76      EfficientNet-B0-W4A4
Quantization  ImageNet  Weight bits           4       EfficientNet-B0-W4A4
Quantization  ImageNet  Activation bits       4       ResNet50-W3A4
Quantization  ImageNet  Top-1 Accuracy (%)    75.45   ResNet50-W3A4
Quantization  ImageNet  Weight bits           3       ResNet50-W3A4
Quantization  ImageNet  Top-1 Accuracy (%)    70.9    MobileNetV2

Related Papers

- Efficient Deployment of Spiking Neural Networks on SpiNNaker2 for DVS Gesture Recognition Using Neuromorphic Intermediate Representation (2025-09-04)
- An End-to-End DNN Inference Framework for the SpiNNaker2 Neuromorphic MPSoC (2025-07-18)
- Task-Specific Audio Coding for Machines: Machine-Learned Latent Features Are Codes for That Machine (2025-07-17)
- Angle Estimation of a Single Source with Massive Uniform Circular Arrays (2025-07-17)
- Quantized Rank Reduction: A Communications-Efficient Federated Learning Scheme for Network-Critical Applications (2025-07-15)
- MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization (2025-07-14)
- Lightweight Federated Learning over Wireless Edge Networks (2025-07-13)
- Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation (2025-07-11)