Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot

Elias Frantar, Dan Alistarh

2023-01-02 · Question Answering · Quantization · Common Sense Reasoning · Language Modelling

Paper · PDF · Code (official)

Abstract

We show for the first time that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one-shot, without any retraining, at minimal loss of accuracy. This is achieved via a new pruning method called SparseGPT, specifically designed to work efficiently and accurately on massive GPT-family models. We can execute SparseGPT on the largest available open-source models, OPT-175B and BLOOM-176B, in under 4.5 hours, and can reach 60% unstructured sparsity with negligible increase in perplexity: remarkably, more than 100 billion weights from these models can be ignored at inference time. SparseGPT generalizes to semi-structured (2:4 and 4:8) patterns, and is compatible with weight quantization approaches. The code is available at: https://github.com/IST-DASLab/sparsegpt.
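The 2:4 and 4:8 patterns mentioned in the abstract constrain where zeros may appear: in every contiguous group of 4 (or 8) weights, at most 2 (or 4) may be nonzero, which is what lets sparse hardware kernels exploit the pattern. As a toy illustration of the 2:4 constraint only — SparseGPT itself chooses and reconstructs the surviving weights using approximate second-order information, not plain magnitude — a minimal sketch in NumPy:

```python
import numpy as np

def prune_2_4(weights: np.ndarray) -> np.ndarray:
    """Enforce a 2:4 semi-structured sparsity pattern by magnitude:
    in every contiguous group of 4 weights, zero out the 2
    smallest-magnitude entries, so exactly 2 survive per group.
    (Toy baseline, NOT the SparseGPT selection rule.)"""
    flat = weights.reshape(-1, 4)                    # groups of 4
    # per group, indices of the 2 smallest-magnitude weights
    drop = np.argsort(np.abs(flat), axis=1)[:, :2]
    mask = np.ones_like(flat, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=1)     # mark them for removal
    return (flat * mask).reshape(weights.shape)

w = np.array([0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01])
print(prune_2_4(w))
# each group of 4 keeps its 2 largest-magnitude weights:
# [ 0.9  0.   0.  -0.7  0.   0.3 -0.4  0. ]
```

Unstructured 50% sparsity drops the same fraction of weights but without the per-group constraint, which is easier on accuracy (as the results below show) but harder to accelerate.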

Results

Task | Dataset | Metric | Value | Model
Question Answering | PIQA | Accuracy | 81.07 | OPT-175B
Question Answering | PIQA | Accuracy | 80.63 | SparseGPT (175B, 50% Sparsity)
Question Answering | PIQA | Accuracy | 79.54 | SparseGPT (175B, 4:8 Sparsity)
Question Answering | PIQA | Accuracy | 79.54 | SparseGPT (175B, 2:4 Sparsity)
Question Answering | PIQA | Accuracy | 54.73 | OPT-175B (50% Sparsity)
Question Answering | StoryCloze | Accuracy | 79.82 | OPT-175B
Question Answering | StoryCloze | Accuracy | 78.87 | SparseGPT (175B, 50% Sparsity)
Question Answering | StoryCloze | Accuracy | 77.02 | SparseGPT (175B, 4:8 Sparsity)
Question Answering | StoryCloze | Accuracy | 76.19 | SparseGPT (175B, 2:4 Sparsity)
Question Answering | StoryCloze | Accuracy | 47.1 | OPT-175B (50% Sparsity)
Common Sense Reasoning | ARC (Challenge) | Accuracy | 43.94 | OPT-175B
Common Sense Reasoning | ARC (Challenge) | Accuracy | 41.3 | SparseGPT (175B, 50% Sparsity)
Common Sense Reasoning | ARC (Challenge) | Accuracy | 39.85 | SparseGPT (175B, 4:8 Sparsity)
Common Sense Reasoning | ARC (Challenge) | Accuracy | 38.99 | SparseGPT (175B, 2:4 Sparsity)
Common Sense Reasoning | ARC (Challenge) | Accuracy | 25.6 | OPT-175B (50% Sparsity)
Common Sense Reasoning | ARC (Easy) | Accuracy | 71.04 | OPT-175B
Common Sense Reasoning | ARC (Easy) | Accuracy | 69.65 | SparseGPT (175B, 50% Sparsity)
Common Sense Reasoning | ARC (Easy) | Accuracy | 68.35 | SparseGPT (175B, 4:8 Sparsity)
Common Sense Reasoning | ARC (Easy) | Accuracy | 67.08 | SparseGPT (175B, 2:4 Sparsity)
Common Sense Reasoning | ARC (Easy) | Accuracy | 28.03 | OPT-175B (50% Sparsity)
Language Modelling | LAMBADA | Accuracy | 79.47 | SparseGPT (175B, 2:4 Sparsity)
Language Modelling | LAMBADA | Accuracy | 78.77 | SparseGPT (175B, 4:8 Sparsity)
Language Modelling | LAMBADA | Accuracy | 76.51 | SparseGPT (175B, 50% Sparsity)
Language Modelling | LAMBADA | Accuracy | 75.59 | OPT-175B
Language Modelling | LAMBADA | Accuracy | 0.02 | OPT-175B (50% Sparsity)
Language Modelling | WikiText-2 | Test perplexity | 8.21 | SparseGPT (175B, 50% Sparsity)
Language Modelling | WikiText-2 | Test perplexity | 8.34 | OPT-175B
Language Modelling | WikiText-2 | Test perplexity | 8.45 | SparseGPT (175B, 4:8 Sparsity)
Language Modelling | WikiText-2 | Test perplexity | 8.73 | SparseGPT (175B, 2:4 Sparsity)
Language Modelling | WikiText-2 | Test perplexity | 234.77 | OPT-175B (50% Sparsity)
