Overcoming catastrophic forgetting with hard attention to the task

Joan Serrà, Dídac Surís, Marius Miron, Alexandros Karatzoglou

2018-01-04 · ICML 2018 · Continual Learning

Abstract

Catastrophic forgetting occurs when a neural network loses the information learned in a previous task after training on subsequent tasks. This problem remains a hurdle for artificial intelligence systems with sequential learning capabilities. In this paper, we propose a task-based hard attention mechanism that preserves previous tasks' information without affecting the current task's learning. A hard attention mask is learned concurrently with each task through stochastic gradient descent, and the masks of previous tasks are used to condition this learning. We show that the proposed mechanism is effective for reducing catastrophic forgetting, cutting current rates by 45 to 80%. We also show that it is robust to different hyperparameter choices and that it offers a number of monitoring capabilities. The approach makes it possible to control both the stability and the compactness of the learned knowledge, which we believe also makes it attractive for online learning or network compression applications.
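To make the mechanism in the abstract more concrete, here is a minimal pure-Python sketch of how a task mask might be formed and how earlier masks might condition later updates. The abstract does not give these formulas, so the specifics are assumptions for illustration: a sigmoid gate over a per-task embedding with a scaling factor `s`, an element-wise maximum to accumulate masks across tasks, and a gradient scaled by one minus the smaller of the two accumulated mask values that a weight connects. All function and variable names are hypothetical and are not taken from any released code.

```python
import math

def hard_attention(embedding, s):
    """Gate a per-task embedding with a scaled sigmoid.

    A large s pushes each value close to 0 or 1, approximating a binary mask.
    (Assumed form; the abstract only says the mask is learned with SGD.)
    """
    return [1.0 / (1.0 + math.exp(-s * e)) for e in embedding]

def accumulate(prev_mask, new_mask):
    """Element-wise max: a unit stays reserved once any earlier task used it."""
    return [max(p, n) for p, n in zip(prev_mask, new_mask)]

def condition_gradient(grad, prev_out, prev_in):
    """Scale grad[i][j] by 1 - min(prev_out[i], prev_in[j]).

    A weight is frozen only when both the unit it feeds (i) and the unit
    it reads from (j) were reserved by earlier tasks.
    """
    return [
        [g * (1.0 - min(prev_out[i], prev_in[j])) for j, g in enumerate(row)]
        for i, row in enumerate(grad)
    ]

# Toy layer with 3 input units and 2 output units, after training task 1.
mask_in = hard_attention([2.0, 2.0, -2.0], s=50.0)   # inputs 0 and 1 reserved
mask_out = hard_attention([2.0, -2.0], s=50.0)       # output 0 reserved

# Raw gradient for task 2 (all ones, purely for illustration).
grad = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
scaled = condition_gradient(grad, mask_out, mask_in)

# Round only for display; the near-binary mask leaves tiny residuals.
print([[round(g, 6) for g in row] for row in scaled])
```

In this toy case only the two weights that connect reserved inputs to the reserved output receive a zero update; every other weight remains free for the new task. That is one way to read the abstract's claim that earlier knowledge is preserved without affecting the current task's learning.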

Results

Task	Dataset	Metric	Value	Model
Continual Learning	20Newsgroup (10 tasks)	F1 - macro	0.9521	HAT
Continual Learning	F-CelebA (10 tasks)	Acc	0.5673	HAT
Continual Learning	ASC (19 tasks)	F1 - macro	0.7816	HAT
Continual Learning	DSC (10 tasks)	F1 - macro	0.8614	HAT
