
Overcoming catastrophic forgetting with hard attention to the task

Joan Serrà, Dídac Surís, Marius Miron, Alexandros Karatzoglou

2018-01-04 | ICML 2018 (July) | Continual Learning

Abstract

Catastrophic forgetting occurs when a neural network loses the information learned in a previous task after training on subsequent tasks. This problem remains a hurdle for artificial intelligence systems with sequential learning capabilities. In this paper, we propose a task-based hard attention mechanism that preserves previous tasks' information without affecting the current task's learning. A hard attention mask is learned concurrently to every task, through stochastic gradient descent, and previous masks are exploited to condition such learning. We show that the proposed mechanism is effective for reducing catastrophic forgetting, cutting current rates by 45 to 80%. We also show that it is robust to different hyperparameter choices, and that it offers a number of monitoring capabilities. The approach features the possibility to control both the stability and compactness of the learned knowledge, which we believe makes it also attractive for online learning or network compression applications.
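The abstract describes the mechanism only at a high level, so the following is a minimal sketch of how per-task masks could condition later learning, not the authors' implementation. It assumes details the abstract does not state: that each mask is a sigmoid of a scaled per-unit task embedding, that masks from earlier tasks are accumulated with an element-wise maximum, and that a weight's gradient is scaled down when both units it connects were claimed by earlier tasks. The names `gate`, `accumulate`, and `mask_gradient` are hypothetical.

```python
import math


def gate(embedding, scale):
    """Near-binary attention mask: sigmoid of a scaled per-unit task embedding.

    Assumption: a large `scale` pushes the sigmoid toward 0 or 1, giving a
    "hard" mask that is still differentiable during training.
    """
    mask = []
    for e in embedding:
        x = scale * e
        # Numerically stable sigmoid, so large |x| does not overflow exp().
        if x >= 0:
            mask.append(1.0 / (1.0 + math.exp(-x)))
        else:
            z = math.exp(x)
            mask.append(z / (1.0 + z))
    return mask


def accumulate(prev_mask, new_mask):
    """Element-wise max: a unit stays claimed once any earlier task used it."""
    return [max(p, n) for p, n in zip(prev_mask, new_mask)]


def mask_gradient(grad, prev_out, prev_in):
    """Scale grad[i][j] by 1 - min(prev_out[i], prev_in[j]).

    grad[i][j] is the gradient of the weight from input unit j to output
    unit i. prev_out and prev_in are the accumulated masks of earlier tasks
    for the layer's output and input units. A weight is frozen only when
    both of its endpoints are claimed.
    """
    return [
        [g * (1.0 - min(prev_out[i], prev_in[j])) for j, g in enumerate(row)]
        for i, row in enumerate(grad)
    ]


if __name__ == "__main__":
    # Hypothetical embeddings for two tasks over three output units.
    task1 = gate([2.0, -2.0, -2.0], scale=50.0)
    task2 = gate([-2.0, 2.0, -2.0], scale=50.0)
    claimed = accumulate(task1, task2)
    print([round(v) for v in claimed])
```

Under these assumptions, weights feeding units that earlier tasks rely on receive a gradient near zero while unclaimed units stay free to learn, which is one way to read the abstract's claim that previous masks are "exploited to condition such learning".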

Results

Task                Dataset                 Metric      Value   Model
Continual Learning  20Newsgroup (10 tasks)  F1 - macro  0.9521  HAT
Continual Learning  F-CelebA (10 tasks)     Acc         0.5673  HAT
Continual Learning  ASC (19 tasks)          F1 - macro  0.7816  HAT
Continual Learning  DSC (10 tasks)          F1 - macro  0.8614  HAT

Related Papers

RegCL: Continual Adaptation of Segment Anything Model via Model Merging (2025-07-16)
Information-Theoretic Generalization Bounds of Replay-based Continual Learning (2025-07-16)
PROL: Rehearsal Free Continual Learning in Streaming Data via Prompt Online Learning (2025-07-16)
Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime (2025-07-15)
A Neural Network Model of Complementary Learning Systems: Pattern Separation and Completion for Continual Learning (2025-07-15)
LifelongPR: Lifelong knowledge fusion for point cloud place recognition based on replay and prompt learning (2025-07-14)
Overcoming catastrophic forgetting in neural networks (2025-07-14)
Continual Reinforcement Learning by Planning with Online World Models (2025-07-12)