Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


DoRA: Weight-Decomposed Low-Rank Adaptation

Shih-Yang Liu, Chien-Yi Wang, Hongxu Yin, Pavlo Molchanov, Yu-Chiang Frank Wang, Kwang-Ting Cheng, Min-Hung Chen

Published: 2024-02-14 · Task: parameter-efficient fine-tuning

Abstract

Among the widely used parameter-efficient fine-tuning (PEFT) methods, LoRA and its variants have gained considerable popularity because they avoid additional inference costs. However, an accuracy gap often remains between these methods and full fine-tuning (FT). In this work, we first introduce a novel weight decomposition analysis to investigate the inherent differences between FT and LoRA. Aiming to resemble the learning capacity of FT based on these findings, we propose Weight-Decomposed Low-Rank Adaptation (DoRA). DoRA decomposes the pre-trained weight into two components, magnitude and direction, for fine-tuning, specifically employing LoRA for directional updates to efficiently minimize the number of trainable parameters. By employing DoRA, we enhance both the learning capacity and training stability of LoRA while avoiding any additional inference overhead. DoRA consistently outperforms LoRA on fine-tuning LLaMA, LLaVA, and VL-BART on various downstream tasks, such as commonsense reasoning, visual instruction tuning, and image/video-text understanding. Code is available at https://github.com/NVlabs/DoRA.
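The decomposition described in the abstract can be sketched in a few lines of numpy. This is a minimal illustration, not the official NVlabs implementation: the weight is split into a per-column magnitude and a unit-norm direction, the direction is updated with low-rank LoRA factors, and the merged weight is renormalized column-wise and rescaled by the trainable magnitude. All variable names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pre-trained weight (d_out x d_in); shapes are illustrative.
d_out, d_in, r = 8, 16, 2
W0 = rng.standard_normal((d_out, d_in))

# Magnitude component: one trainable scalar per column, m = ||W0||_c.
m = np.linalg.norm(W0, axis=0, keepdims=True)   # shape (1, d_in)

# LoRA factors for the directional update; B starts at zero as in standard LoRA,
# so training begins exactly at the pre-trained weight.
B = np.zeros((d_out, r))
A = rng.standard_normal((r, d_in))

# Merged weight: W' = m * (W0 + B A) / ||W0 + B A||_c  (column-wise norm),
# so inference uses a single dense matrix with no extra overhead.
V = W0 + B @ A
W_merged = m * (V / np.linalg.norm(V, axis=0, keepdims=True))

# With B = 0, the directional update is a no-op: W' equals W0.
assert np.allclose(W_merged, W0)
```

During fine-tuning, only `m`, `A`, and `B` would be trained while `W0` stays frozen; after training, `W_merged` can be computed once and used as an ordinary weight matrix, which is why no inference-time cost is added.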

Results

Task                              | Dataset    | Metric       | Value | Model
parameter-efficient fine-tuning   | HellaSwag  | Accuracy (%) | 76.27 | LLaMA2-7b
parameter-efficient fine-tuning   | BoolQ      | Accuracy (%) | 81.93 | LLaMA2-7b
parameter-efficient fine-tuning   | WinoGrande | Accuracy (%) | 70.09 | LLaMA2-7b

Related Papers

Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy (2025-07-17)
LoSiA: Efficient High-Rank Fine-Tuning via Subnet Localization and Optimization (2025-07-06)
Exploring Adapter Design Tradeoffs for Low Resource Music Generation (2025-06-26)
WordCon: Word-level Typography Control in Scene Text Rendering (2025-06-26)
Optimising Language Models for Downstream Tasks: A Post-Training Perspective (2025-06-26)
Progtuning: Progressive Fine-tuning Framework for Transformer-based Language Models (2025-06-26)
Detecting Referring Expressions in Visually Grounded Dialogue with Autoregressive Language Models (2025-06-26)
ARD-LoRA: Dynamic Rank Allocation for Parameter-Efficient Fine-Tuning of Foundation Models with Heterogeneous Adaptation Needs (2025-06-23)