Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


ULMFiT

Universal Language Model Fine-tuning

Natural Language Processing · Introduced 2018 · 40 papers
Source Paper: Universal Language Model Fine-tuning for Text Classification (Howard & Ruder, 2018)

Description

Universal Language Model Fine-tuning, or ULMFiT, is an architecture and transfer learning method that can be applied to NLP tasks. It uses a 3-layer AWD-LSTM architecture for its representations. Training consists of three steps: 1) general-domain language model pretraining on Wikipedia text (WikiText-103), 2) fine-tuning the language model on the target task's data, and 3) fine-tuning the classifier on the target task.

As different layers capture different types of information, they are fine-tuned to different extents using discriminative fine-tuning, which assigns each layer group its own learning rate. Training uses slanted triangular learning rates (STLR), a learning rate schedule that first linearly increases the learning rate over a short warm-up phase and then linearly decays it over the remaining iterations.
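The two ideas above can be sketched in plain Python. The function names are illustrative (not a library API); the STLR formula and defaults (cut_frac=0.1, ratio=32) and the 2.6 discriminative-rate factor follow the ULMFiT paper.

```python
import math

def slanted_triangular_lr(t, T, eta_max=0.01, cut_frac=0.1, ratio=32):
    """STLR: learning rate at iteration t (0-based) out of T total.

    Rises linearly for the first cut_frac fraction of iterations,
    then decays linearly; ratio bounds how much smaller the lowest
    rate is than eta_max.
    """
    cut = math.floor(T * cut_frac)
    if t < cut:
        p = t / cut                                   # warm-up phase
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # decay phase
    return eta_max * (1 + p * (ratio - 1)) / ratio

def discriminative_lrs(base_lr, n_groups, factor=2.6):
    """Discriminative fine-tuning: the top (output-side) group gets
    base_lr; each lower group's rate is divided by `factor` again."""
    return [base_lr / factor ** (n_groups - 1 - i) for i in range(n_groups)]
```

For example, with T=100 iterations the rate starts at eta_max/32, peaks at eta_max after 10 iterations, and decays back; `discriminative_lrs(0.01, 3)` gives the lowest layer 0.01/2.6² and the top layer 0.01.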

Fine-tuning the target classifier is achieved in ULMFiT using gradual unfreezing. Rather than fine-tuning all layers at once, which risks catastrophic forgetting, ULMFiT gradually unfreezes the model starting from the last layer (i.e., closest to the output), as this contains the least general knowledge. First the last layer is unfrozen and all unfrozen layers are fine-tuned for one epoch. Then the next lower group of frozen layers is unfrozen and fine-tuned, and the process repeats until all layers are unfrozen and trained to convergence in the final iteration.
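A framework-agnostic sketch of this schedule, assuming the model is split into ordered layer groups (the `LayerGroup` class and group names below are illustrative, not the fastai API):

```python
class LayerGroup:
    """A named group of layers that can be frozen or trainable."""
    def __init__(self, name):
        self.name = name
        self.frozen = True  # everything starts frozen

def gradual_unfreeze_schedule(groups):
    """Unfreeze one group per epoch, starting from the output side.

    `groups` is ordered from input (most general) to output (most
    task-specific). Returns a list of (epoch, trainable group names).
    """
    schedule = []
    for epoch, group in enumerate(reversed(groups)):
        group.frozen = False  # unfreeze the next group down
        trainable = [g.name for g in groups if not g.frozen]
        schedule.append((epoch, trainable))
    return schedule

groups = [LayerGroup(n) for n in
          ["embedding", "lstm1", "lstm2", "lstm3", "classifier"]]
schedule = gradual_unfreeze_schedule(groups)
```

In epoch 0 only the classifier head is trained; by the last epoch every group, down to the embeddings, is trainable.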

Papers Using This Method

Advanced Deep Learning Techniques for Analyzing Earnings Call Transcripts: Methodologies and Applications (2025-02-27)
No Argument Left Behind: Overlapping Chunks for Faster Processing of Arbitrarily Long Legal Texts (2024-10-24)
RICo: Reddit ideological communities (2024-06-05)
Exploring Multi-Level Threats in Telegram Data with AI-Human Annotation: A Preliminary Study (2023-12-15)
Illicit Darkweb Classification via Natural-language Processing: Classifying Illicit Content of Webpages based on Textual Information (2023-12-08)
Explainable and High-Performance Hate and Offensive Speech Detection (2022-06-26)
IIITT@Dravidian-CodeMix-FIRE2021: Transliterate or translate? Sentiment analysis of code-mixed text in Dravidian languages (2021-11-15)
Offensive Language Identification in Low-resourced Code-mixed Dravidian languages using Pseudo-labeling (2021-08-27)
Towards Offensive Language Identification for Tamil Code-Mixed YouTube Comments and Posts (2021-08-24)
Learning ULMFiT and Self-Distillation with Calibration for Medical Dialogue System (2021-07-20)
WHOSe Heritage: Classification of UNESCO World Heritage "Outstanding Universal Value" Documents with Soft Labels (2021-04-12)
L3CubeMahaSent: A Marathi Tweet-based Sentiment Analysis Dataset (2021-03-21)
Experimental Evaluation of Deep Learning models for Marathi Text Classification (2021-01-13)
LaDiff ULMFiT: A Layer Differentiated training approach for ULMFiT (2021-01-13)
HinglishNLP at SemEval-2020 Task 9: Fine-tuned Language Models for Hinglish Sentiment Detection (2020-12-01)
Smash at SemEval-2020 Task 7: Optimizing the Hyperparameters of ERNIE 2.0 for Humor Ranking and Rating (2020-12-01)
Palomino-Ochoa at SemEval-2020 Task 9: Robust System based on Transformer for Code-Mixed Sentiment Classification (2020-11-18)
Pagsusuri ng RNN-based Transfer Learning Technique sa Low-Resource Language (2020-10-13)
Gauravarora@HASOC-Dravidian-CodeMix-FIRE2020: Pre-training ULMFiT on Synthetically Generated Code-Mixed Data for Hate Speech Detection (2020-10-05)
FarsTail: A Persian Natural Language Inference Dataset (2020-09-18)