Linear Warmup
Description
Linear Warmup is a learning rate schedule that linearly increases the learning rate from a low initial value to a target value over a fixed number of steps, then holds it constant. Ramping up gradually reduces volatility in the early stages of training.
Image Credit: Chengwei Zhang
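The schedule described above can be sketched in a few lines of plain Python. The function and parameter names below (warmup_lr, warmup_steps, base_lr) are illustrative, not from any particular library:

```python
def warmup_lr(step: int, warmup_steps: int, base_lr: float) -> float:
    """Linear warmup: ramp the learning rate from near 0 up to base_lr
    over the first warmup_steps steps, then hold it constant."""
    if step < warmup_steps:
        # Fraction of warmup completed; step is 0-indexed, so step 0
        # already gets a small non-zero rate.
        return base_lr * (step + 1) / warmup_steps
    return base_lr
```

In practice this kind of function is often passed to a framework's lambda-based scheduler (e.g. a multiplicative factor per step), frequently combined with a decay schedule such as cosine or inverse square root after the warmup phase.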
Papers Using This Method
The Warmup Dilemma: How Learning Rate Strategies Impact Speech-to-Text Model Convergence (2025-05-29)
Sample and Computationally Efficient Continuous-Time Reinforcement Learning with General Function Approximation (2025-05-20)
Tractable Representations for Convergent Approximation of Distributional HJB Equations (2025-03-07)
Integrating LLMs with ITS: Recent Advances, Potentials, Challenges, and Future Directions (2025-01-08)
A Combined Encoder and Transformer Approach for Coherent and High-Quality Text Generation (2024-11-19)
Causal Temporal Representation Learning with Nonstationary Sparse Transition (2024-09-05)
Machine learning models for daily rainfall forecasting in Northern Tropical Africa using tropical wave predictors (2024-08-29)
CTRL: Continuous-Time Representation Learning on Temporal Heterogeneous Information Network (2024-05-11)
Towards Adversarial Robustness And Backdoor Mitigation in SSL (2024-03-23)
Two Trades is not Baffled: Condensing Graph via Crafting Rational Gradient Matching (2024-02-07)
Continual Pre-Training of Large Language Models: How to (re)warm your model? (2023-08-08)
CTRL: Connect Collaborative and Language Model for CTR Prediction (2023-06-05)
SweCTRL-Mini: a data-transparent Transformer-based large language model for controllable text generation in Swedish (2023-04-27)
Once Detected, Never Lost: Surpassing Human Performance in Offline LiDAR based 3D Object Detection (2023-04-24)
Elastic Weight Removal for Faithful and Abstractive Dialogue Generation (2023-03-30)
Mixing Backward- with Forward-Chaining for Metacognitive Skill Acquisition and Transfer (2023-03-18)
Alternative formulations for gilthead seabream diets: towards a more sustainable production (2022-11-03)
Controllable Factuality in Document-Grounded Dialog Systems Using a Noisy Channel Model (2022-10-31)
Unsupervised Learning of Structured Representations via Closed-Loop Transcription (2022-10-30)
An Embarrassingly Simple Backdoor Attack on Self-supervised Learning (2022-10-13)