
Mixture-of-Subspaces in Low-Rank Adaptation

Taiqiang Wu, Jiahao Wang, Zhe Zhao, Ngai Wong

2024-06-16

Tasks: Question Answering, Text-to-Image Generation, Sentence Completion, Common Sense Reasoning, Image Generation, Visual Question Answering

Paper · PDF · Code (official)

Abstract

In this paper, we introduce a subspace-inspired Low-Rank Adaptation (LoRA) method, which is computationally efficient, easy to implement, and readily applicable to large language, multimodal, and diffusion models. Initially, we equivalently decompose the weights of LoRA into two subspaces, and find that simply mixing them can enhance performance. To study this phenomenon, we revisit it through a fine-grained subspace lens, showing that such modification is equivalent to employing a fixed mixer to fuse the subspaces. To be more flexible, we jointly learn the mixer with the original LoRA weights, and term the method Mixture-of-Subspaces LoRA (MoSLoRA). MoSLoRA consistently outperforms LoRA on tasks in different modalities, including commonsense reasoning, visual instruction tuning, and subject-driven text-to-image generation, demonstrating its effectiveness and robustness. Code is available at https://github.com/wutaiqiang/MoSLoRA.
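The core change relative to vanilla LoRA is small enough to sketch directly. Vanilla LoRA learns a low-rank update ΔW = BA; per the abstract, MoSLoRA inserts a learnable r×r mixer M between the two projections, ΔW = BMA, so the subspaces spanned by A and B can be fused flexibly (plain LoRA corresponds to fixing M to the identity). The PyTorch sketch below is an illustrative reading of the abstract, not the official code (that lives in the linked repository); the class name, initialization choices, and hyperparameters are assumptions.

```python
import torch
import torch.nn as nn


class MoSLoRALinear(nn.Module):
    """Sketch of a LoRA layer with a learnable r x r mixer M between the
    down-projection A and the up-projection B, i.e. delta_W = B @ M @ A.
    Plain LoRA corresponds to freezing M as the identity matrix.
    Class name, init, and hyperparameters are illustrative assumptions."""

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)          # frozen pretrained weight
        self.lora_A = nn.Parameter(torch.empty(r, in_features))
        self.mixer = nn.Parameter(torch.empty(r, r))    # learnable subspace mixer M
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))  # zeros => delta_W = 0 at start
        nn.init.kaiming_uniform_(self.lora_A, a=5 ** 0.5)
        nn.init.kaiming_uniform_(self.mixer, a=5 ** 0.5)
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = x W^T + s * (x A^T M^T B^T), equivalent to delta_W = B @ M @ A
        delta = x @ self.lora_A.T @ self.mixer.T @ self.lora_B.T
        return self.base(x) + self.scaling * delta


# Usage: adapt a 4096-d projection with a rank-16 update.
layer = MoSLoRALinear(4096, 4096, r=16, alpha=32)
out = layer(torch.randn(2, 4096))   # shape (2, 4096)
```

Note that the mixer adds only r² extra trainable parameters per adapted layer, which is negligible next to the r·(d_in + d_out) parameters of A and B, which is why the method stays computationally cheap.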

Results

| Task | Dataset | Metric | Value | Model |
|------|---------|--------|-------|-------|
| Question Answering | SIQA | Accuracy | 81 | LLaMA-3 8B + MoSLoRA (fine-tuned) |
| Question Answering | PIQA | Accuracy | 89.7 | LLaMA-3 8B + MoSLoRA |
| Question Answering | BoolQ | Accuracy | 74.6 | LLaMA-3 + MoSLoRA |
| Question Answering | OpenBookQA | Accuracy | 86.8 | LLaMA-3 8B + MoSLoRA |
| Visual Question Answering (VQA) | MMBench | GPT-3.5 score | 73.8 | LLaVA-InternLM2-ViT + MoSLoRA |
| Visual Question Answering (VQA) | MMBench | GPT-3.5 score | 73 | LLaVA-LLaMA3-8B-ViT + MoSLoRA |
| Visual Question Answering (VQA) | MM-Vet | GPT-4 score | 35.2 | LLaVA-InternLM2-7B-ViT + MoSLoRA |
| Visual Question Answering (VQA) | MM-Vet | GPT-4 score | 35.2 | InternLM2 + ViT (QMoSLoRA) |
| Common Sense Reasoning | WinoGrande | Accuracy | 85.8 | LLaMA-3 8B + MoSLoRA |
| Common Sense Reasoning | ARC (Challenge) | Accuracy | 81.5 | LLaMA-3 8B + MoSLoRA (fine-tuned) |
| Common Sense Reasoning | ARC (Easy) | Accuracy | 90.5 | LLaMA-3 8B + MoSLoRA (fine-tuned) |
| Sentence Completion | HellaSwag | Accuracy | 95 | LLaMA-3 + MoSLoRA |
