MoE

Mixture of Experts

General · Introduced 2000 · 366 papers
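In a mixture of experts, several expert sub-networks process the input and a learned gating network weights their outputs; in sparse variants, each input is routed to only the top-k experts, so model capacity grows without a proportional increase in per-input compute. The sketch below is a minimal NumPy illustration of sparse top-k routing; the dimensions, weight names, and single-layer "experts" are assumptions chosen for brevity and are not taken from any of the papers listed here.

    # Minimal sketch of sparse top-k mixture-of-experts routing (illustrative only).
    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    rng = np.random.default_rng(0)
    d_model, n_experts, top_k, n_tokens = 16, 4, 2, 8

    # Gating network: a single linear map producing one score per expert.
    W_gate = rng.normal(size=(d_model, n_experts))
    # Each "expert" is a single linear layer here, purely for brevity.
    W_experts = rng.normal(size=(n_experts, d_model, d_model))

    x = rng.normal(size=(n_tokens, d_model))
    scores = softmax(x @ W_gate)                       # (n_tokens, n_experts)
    top_idx = np.argsort(-scores, axis=-1)[:, :top_k]  # top-k expert ids per token

    y = np.zeros_like(x)
    for t in range(n_tokens):
        # Renormalize the selected gate weights so they sum to 1 per token.
        w = scores[t, top_idx[t]]
        w = w / w.sum()
        for k, e in enumerate(top_idx[t]):
            y[t] += w[k] * (x[t] @ W_experts[e])

Real implementations dispatch tokens to experts in batches and add a load-balancing term to the training loss; the loop above only shows the routing logic itself.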

Papers Using This Method

Mixture of Experts in Large Language Models (2025-07-15)
MoFE-Time: Mixture of Frequency Domain Experts for Time-Series Forecasting Models (2025-07-09)
Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis (2025-07-08)
Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate (2025-07-08)
Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts (2025-07-07)
Sub-MoE: Efficient Mixture-of-Expert LLMs Compression via Subspace Expert Merging (2025-06-29)
Latent Prototype Routing: Achieving Near-Perfect Load Balancing in Mixture-of-Experts (2025-06-26)
SAFEx: Analyzing Vulnerabilities of MoE-Based LLMs via Stable Safety-critical Expert Identification (2025-06-20)
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs (2025-06-17)
Less is More: Undertraining Experts Improves Model Upcycling (2025-06-17)
Utility-Driven Speculative Decoding for Mixture-of-Experts (2025-06-17)
LoRA-Mixer: Coordinate Modular LoRA Experts Through Serial Attention Routing (2025-06-17)
EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization (2025-06-16)
Serving Large Language Models on Huawei CloudMatrix384 (2025-06-15)
Ming-Omni: A Unified Multimodal Model for Perception and Generation (2025-06-11)
MoE-GPS: Guidlines for Prediction Strategy for Dynamic Expert Duplication in MoE Load Balancing (2025-06-09)
M2Restore: Mixture-of-Experts-based Mamba-CNN Fusion Framework for All-in-One Image Restoration (2025-06-09)
SMAR: Soft Modality-Aware Routing Strategy for MoE-based Multimodal Large Language Models Preserving Language Capabilities (2025-06-06)
FlashDMoE: Fast Distributed MoE in a Single Kernel (2025-06-05)
Out-of-Distribution Graph Models Merging (2025-06-04)