Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


LAMB

Category: General · Introduced: 2019 · 199 papers
Source Paper

Description

LAMB is a layerwise adaptive large-batch optimization technique. It provides a strategy for adapting the learning rate in large-batch settings. LAMB uses Adam as the base algorithm and forms the update as:

$$r_{t} = \frac{m_{t}}{\sqrt{v_{t}} + \epsilon}$$

$$x_{t+1}^{(i)} = x_{t}^{(i)} - \eta_{t}\,\frac{\phi\left(\| x_{t}^{(i)} \|\right)}{\| m_{t}^{(i)} \|}\left(r_{t}^{(i)} + \lambda x_{t}^{(i)}\right)$$

Unlike LARS, the adaptivity of LAMB is two-fold: (i) per-dimension normalization with respect to the square root of the second moment, as in Adam, and (ii) layerwise normalization via the per-layer trust ratio.
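The two levels of adaptivity can be sketched in a few lines of NumPy. This is a minimal single-layer sketch, not a production optimizer: the function name, hyperparameter defaults, and the identity choice of the scaling function phi are illustrative assumptions. Note that, following the original paper's algorithm and common implementations, the trust ratio below divides by the norm of the full update $r_t + \lambda x_t$ rather than by the norm of $m_t$.

```python
import numpy as np

def lamb_step(x, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-6, weight_decay=0.01, phi=lambda z: z):
    """One LAMB update for a single layer's weights x (hypothetical helper).

    phi is the layerwise scaling function; identity is used here for
    illustration. Returns the updated (x, m, v).
    """
    # Adam-style first and second moment estimates.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction, as in Adam.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # (i) Per-dimension normalization: r_t = m_t / (sqrt(v_t) + eps).
    r = m_hat / (np.sqrt(v_hat) + eps)
    update = r + weight_decay * x
    # (ii) Layerwise normalization: trust ratio phi(||x||) / ||update||.
    w_norm = np.linalg.norm(x)
    u_norm = np.linalg.norm(update)
    trust = phi(w_norm) / u_norm if w_norm > 0 and u_norm > 0 else 1.0
    x = x - lr * trust * update
    return x, m, v
```

The trust ratio rescales each layer's step to be proportional to the layer's weight norm, which is what allows stable training at very large batch sizes.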

Papers Using This Method

- ALBERT: Advanced Localization and Bidirectional Encoder Representations from Transformers for Automotive Damage Evaluation (2025-06-12)
- Rapid yet accurate Tile-circuit and device modeling for Analog In-Memory Computing (2025-05-05)
- Don't Fight Hallucinations, Use Them: Estimating Image Realism using NLI over Atomic Facts (2025-03-20)
- Efficient or Powerful? Trade-offs Between Machine Learning and Deep Learning for Mental Illness Detection on Social Media (2025-03-03)
- Robust Bias Detection in MLMs and its Application to Human Trait Ratings (2025-02-21)
- Meursault as a Data Point (2025-02-03)
- Aligning Brain Activity with Advanced Transformer Models: Exploring the Role of Punctuation in Semantic Processing (2025-01-10)
- TradingAgents: Multi-Agents LLM Financial Trading Framework (2024-12-28)
- A Comparative Analysis of Transformer and LSTM Models for Detecting Suicidal Ideation on Reddit (2024-11-23)
- BERT-Based Approach for Automating Course Articulation Matrix Construction with Explainable AI (2024-11-21)
- ProTransformer: Robustify Transformers via Plug-and-Play Paradigm (2024-10-30)
- A Bayesian Perspective on the Maximum Score Problem (2024-10-22)
- Meta-RTL: Reinforcement-Based Meta-Transfer Learning for Low-Resource Commonsense Reasoning (2024-09-27)
- Profiling Patient Transcript Using Large Language Model Reasoning Augmentation for Alzheimer's Disease Detection (2024-09-19)
- BioMNER: A Dataset for Biomedical Method Entity Recognition (2024-06-28)
- Concept Formation and Alignment in Language Models: Bridging Statistical Patterns in Latent Space to Concept Taxonomy (2024-06-08)
- Effect of antibody levels on the spread of disease in multiple infections (2024-05-31)
- CEEBERT: Cross-Domain Inference in Early Exit BERT (2024-05-23)
- A Named Entity Recognition and Topic Modeling-based Solution for Locating and Better Assessment of Natural Disasters in Social Media (2024-05-01)
- Exploring Internal Numeracy in Language Models: A Case Study on ALBERT (2024-04-25)