Description
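The Lion optimizer was discovered by symbolic program search. It is more memory-efficient than most adaptive optimizers because it only needs to keep track of the momentum, whereas optimizers such as Adam also store a second-moment estimate. Lion's update is produced by the sign function, so the update to every parameter has the same magnitude.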
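The description above can be made concrete with a minimal sketch in plain Python. It follows the update rule as commonly stated for Lion: interpolate the momentum and the gradient, take the sign, apply decoupled weight decay, then refresh the momentum. The function name `lion_step`, the list-of-floats representation, and the default values (`lr=1e-4`, `beta1=0.9`, `beta2=0.99`) are assumptions made for illustration and are not taken from this page.

```python
def sign(x):
    # Returns -1, 0, or 1 depending on the sign of x.
    return (x > 0) - (x < 0)


def lion_step(params, grads, momentum, lr=1e-4, beta1=0.9, beta2=0.99,
              weight_decay=0.0):
    """One Lion-style update over flat lists of floats (illustrative sketch).

    Only one state value per parameter (the momentum) is kept, which is
    where the memory saving over Adam-like optimizers comes from.
    """
    new_params, new_momentum = [], []
    for p, g, m in zip(params, grads, momentum):
        # Interpolate momentum and current gradient, then keep only the sign.
        update = sign(beta1 * m + (1 - beta1) * g)
        # Decoupled weight decay is added to the signed update.
        new_params.append(p - lr * (update + weight_decay * p))
        # The momentum is tracked with a separate coefficient, beta2.
        new_momentum.append(beta2 * m + (1 - beta2) * g)
    return new_params, new_momentum


# Usage: one step from zero momentum with hypothetical values.
params, momentum = lion_step([1.0, -2.0, 0.5], [0.3, -4.0, 0.0],
                             [0.0, 0.0, 0.0], lr=0.1)
```

Because the sign discards the gradient's magnitude, the step size is governed by `lr` alone (plus weight decay). This is why Lion is usually run with a smaller learning rate than Adam-style optimizers.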
Papers Using This Method
Towards Efficient and Effective Alignment of Large Language Models (2025-06-11)
Lions and Muons: Optimization via Stochastic Frank-Wolfe (2025-06-04)
Linear Attention for Efficient Bidirectional Sequence Modeling (2025-02-22)
AdaGC: Improving Training Stability for Large Language Model Pretraining (2025-02-16)
How Memory in Optimization Algorithms Implicitly Modifies the Loss (2025-02-04)
Connections between Schedule-Free Optimizers, AdEMAMix, and Accelerated SGD Variants (2025-02-04)
Grams: Gradient Descent with Adaptive Momentum Scaling (2024-12-22)
Lion Cub: Minimizing Communication Overhead in Distributed Lion (2024-11-25)
MARS: Unleashing the Power of Variance Reduction for Training Large Models (2024-11-15)
Convergence Rate Analysis of LION (2024-11-12)
TalkMosaic: Interactive PhotoMosaic with Multi-modal LLM Q&A Interactions (2024-09-20)
Learning Spatially-Aware Language and Audio Embeddings (2024-09-17)
LION: Linear Group RNN for 3D Object Detection in Point Clouds (2024-07-25)
Deconstructing What Makes a Good Optimizer for Language Models (2024-07-10)
Dynamic Multi-Objective Lion Swarm Optimization with Multi-strategy Fusion: An application in 6R robot trajectory planning (2024-05-31)
Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling (2024-05-23)
Audio-Visual Speech Recognition based on Regulated Transformer and Spatio-Temporal Fusion Strategy for Driver Assistive Systems (2024-05-09)
Communication Efficient Distributed Training with Distributed Lion (2024-03-30)
FedLion: Faster Adaptive Federated Optimization with Fewer Communication (2024-02-15)
MADA: Meta-Adaptive Optimizers through hyper-gradient Descent (2024-01-17)