MoE

Mixture of Experts

General · Introduced 2000 · 366 papers
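In a mixture of experts, several expert sub-networks process the input and a learned gating network weights their outputs; in sparse variants, each input is routed to only the top-k experts, so model capacity grows without a proportional increase in per-input compute. The sketch below is a minimal NumPy illustration of sparse top-k routing; the dimensions, weight names, and single-layer "experts" are assumptions chosen for brevity and are not taken from any of the papers listed here.

    # Minimal sketch of sparse top-k mixture-of-experts routing (illustrative only).
    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    rng = np.random.default_rng(0)
    d_model, n_experts, top_k, n_tokens = 16, 4, 2, 8

    # Gating network: a single linear map producing one score per expert.
    W_gate = rng.normal(size=(d_model, n_experts))
    # Each "expert" is a single linear layer here, purely for brevity.
    W_experts = rng.normal(size=(n_experts, d_model, d_model))

    x = rng.normal(size=(n_tokens, d_model))
    scores = softmax(x @ W_gate)                       # (n_tokens, n_experts)
    top_idx = np.argsort(-scores, axis=-1)[:, :top_k]  # top-k expert ids per token

    y = np.zeros_like(x)
    for t in range(n_tokens):
        # Renormalize the selected gate weights so they sum to 1 per token.
        w = scores[t, top_idx[t]]
        w = w / w.sum()
        for k, e in enumerate(top_idx[t]):
            y[t] += w[k] * (x[t] @ W_experts[e])

Real implementations dispatch tokens to experts in batches and add a load-balancing term to the training loss; the loop above only shows the routing logic itself.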

Papers Using This Method

Mixture of Experts in Large Language Models (2025-07-15)
MoFE-Time: Mixture of Frequency Domain Experts for Time-Series Forecasting Models (2025-07-09)
Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis (2025-07-08)
Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate (2025-07-08)
Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts (2025-07-07)
Sub-MoE: Efficient Mixture-of-Expert LLMs Compression via Subspace Expert Merging (2025-06-29)
Latent Prototype Routing: Achieving Near-Perfect Load Balancing in Mixture-of-Experts (2025-06-26)
SAFEx: Analyzing Vulnerabilities of MoE-Based LLMs via Stable Safety-critical Expert Identification (2025-06-20)
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs (2025-06-17)
Less is More: Undertraining Experts Improves Model Upcycling (2025-06-17)
Utility-Driven Speculative Decoding for Mixture-of-Experts (2025-06-17)
LoRA-Mixer: Coordinate Modular LoRA Experts Through Serial Attention Routing (2025-06-17)
EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization (2025-06-16)
Serving Large Language Models on Huawei CloudMatrix384 (2025-06-15)
Ming-Omni: A Unified Multimodal Model for Perception and Generation (2025-06-11)
MoE-GPS: Guidlines for Prediction Strategy for Dynamic Expert Duplication in MoE Load Balancing (2025-06-09)
M2Restore: Mixture-of-Experts-based Mamba-CNN Fusion Framework for All-in-One Image Restoration (2025-06-09)
SMAR: Soft Modality-Aware Routing Strategy for MoE-based Multimodal Large Language Models Preserving Language Capabilities (2025-06-06)
FlashDMoE: Fast Distributed MoE in a Single Kernel (2025-06-05)
Out-of-Distribution Graph Models Merging (2025-06-04)