Papers With Code
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


R2 Loss: Range Restriction Loss for Model Compression and Quantization

Arnav Kundu, Chungkuk Yoo, Srijan Mishra, Minsik Cho, Saurabh Adya

2023-03-14 · Quantization · Model Compression · Classification

Abstract

Model quantization and compression are widely used techniques for reducing computing-resource usage at inference time. While state-of-the-art works achieve reasonable accuracy at higher bit widths such as 4-bit or 8-bit, quantizing or compressing a model further, e.g., to 1-bit or 2-bit, remains challenging. To overcome this challenge, we focus on outliers in the weights of a pre-trained model, which disrupt effective lower-bit quantization and compression. In this work, we propose Range Restriction Loss (R2-Loss) for building lower-bit quantization- and compression-friendly models by removing outliers from the weights during pre-training. By effectively restricting the range of the weights, we mold the overall distribution into a tight shape that ensures high quantization bit resolution, allowing model compression and quantization techniques to better utilize their limited numeric representation power. We introduce three variants, L-inf R2-Loss, its extension Margin R2-Loss, and a new Soft-Min-Max R2-Loss, to be used as an auxiliary loss during full-precision model training. The variants suit different cases: L-inf and Margin R2-Loss are effective for symmetric quantization, while Soft-Min-Max R2-Loss shows better performance for model compression. In our experiments, R2-Loss improves lower-bit quantization accuracy with state-of-the-art post-training quantization (PTQ), quantization-aware training (QAT), and model compression techniques. With R2-Loss, MobileNet-V2 2-bit weight and 8-bit activation PTQ, MobileNet-V1 2-bit weight and activation QAT, and ResNet18 1-bit weight compression improve to 59.49% from 50.66%, 59.05% from 55.96%, and 52.58% from 45.54%, respectively.
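The abstract names three auxiliary-loss variants but does not spell out their formulas here. Below is a minimal, pure-Python sketch of plausible forms inferred from the names alone: an L-inf penalty on the largest-magnitude weight, a margin penalty on weights outside a symmetric range, and a log-sum-exp surrogate for the weight range (soft max minus soft min). The exact formulations, the `margin` and `temperature` parameters, and any per-layer weighting are assumptions for illustration; consult the paper for the real definitions.

```python
import math

def linf_r2_loss(weights):
    # L-inf variant: the largest absolute weight in the layer.
    # Minimizing it pulls outliers toward the bulk of the distribution.
    return max(abs(w) for w in weights)

def margin_r2_loss(weights, margin=1.0):
    # Margin variant (illustrative form): penalize only the portion of
    # each weight that lies outside the symmetric range [-margin, margin].
    return sum(max(abs(w) - margin, 0.0) ** 2 for w in weights) / len(weights)

def soft_min_max_r2_loss(weights, temperature=10.0):
    # Soft-Min-Max variant (illustrative form): a smooth, differentiable
    # surrogate for the weight range max(w) - min(w), built from
    # log-sum-exp; an asymmetric range suits compression via clustering.
    soft_max = math.log(sum(math.exp(temperature * w) for w in weights)) / temperature
    soft_min = -math.log(sum(math.exp(-temperature * w) for w in weights)) / temperature
    return soft_max - soft_min
```

As an auxiliary loss, any of these would be added to the task loss during full-precision training, e.g. `total = task_loss + lam * linf_r2_loss(layer_weights)` with a small scaling factor `lam` (a hypothetical hyperparameter name).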

Results

Task | Dataset | Metric | Value | Model
Model Compression | QNLI | Accuracy | 82.13 | MobileBERT + 2bit-1dim model compression using DKM
Model Compression | QNLI | Accuracy | 63.17 | MobileBERT + 1bit-1dim model compression using DKM
Model Compression | ImageNet | Top-1 | 70.52 | ResNet-18 + 4bit-1dim model compression using DKM
Model Compression | ImageNet | Top-1 | 69.63 | MobileNet-v1 + 4bit-1dim model compression using DKM
Model Compression | ImageNet | Top-1 | 68.63 | ResNet-18 + 2bit-1dim model compression using DKM
Model Compression | ImageNet | Top-1 | 67.62 | MobileNet-v1 + 2bit-1dim model compression using DKM
Model Compression | ImageNet | Top-1 | 66.1 | ResNet-18 + 4bit-4dim model compression using DKM
Model Compression | ImageNet | Top-1 | 64.7 | ResNet-18 + 2bit-2dim model compression using DKM
Model Compression | ImageNet | Top-1 | 61.4 | MobileNet-v1 + 4bit-4dim model compression using DKM
Model Compression | ImageNet | Top-1 | 59.7 | ResNet-18 + 1bit-1dim model compression using DKM
Model Compression | ImageNet | Top-1 | 53.99 | MobileNet-v1 + 2bit-2dim model compression using DKM
Model Compression | ImageNet | Top-1 | 52.58 | MobileNet-v1 + 1bit-1dim model compression using DKM
Quantization | ImageNet | Top-1 Accuracy (%) | 69.79 | MobileNet-v1 + EWGS + R2Loss
Quantization | ImageNet | Weight bits | 4 | MobileNet-v1 + EWGS + R2Loss
Quantization | ImageNet | Top-1 Accuracy (%) | 69.64 | MobileNet-v1 + LSQ + R2Loss
Quantization | ImageNet | Activation bits | 4 | ResNet-18 + PACT + R2Loss
Quantization | ImageNet | Top-1 Accuracy (%) | 68.45 | ResNet-18 + PACT + R2Loss
Quantization | ImageNet | Weight bits | 2 | ResNet-18 + PACT + R2Loss

Related Papers

Efficient Deployment of Spiking Neural Networks on SpiNNaker2 for DVS Gesture Recognition Using Neuromorphic Intermediate Representation (2025-09-04)
LINR-PCGC: Lossless Implicit Neural Representations for Point Cloud Geometry Compression (2025-07-21)
An End-to-End DNN Inference Framework for the SpiNNaker2 Neuromorphic MPSoC (2025-07-18)
Task-Specific Audio Coding for Machines: Machine-Learned Latent Features Are Codes for That Machine (2025-07-17)
Angle Estimation of a Single Source with Massive Uniform Circular Arrays (2025-07-17)
Adversarial attacks to image classification systems using evolutionary algorithms (2025-07-17)
Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation (2025-07-16)
Safeguarding Federated Learning-based Road Condition Classification (2025-07-16)