DyTox: Transformers for Continual Learning with DYnamic TOken eXpansion

Arthur Douillard, Alexandre Ramé, Guillaume Couairon, Matthieu Cord

2021-11-22 · CVPR 2022 · Continual Learning · Class Incremental Learning · Incremental Learning
Paper · PDF · Code (official)

Abstract

Deep network architectures struggle to continually learn new tasks without forgetting previous ones. A recent trend shows that dynamic architectures based on parameter expansion can efficiently reduce catastrophic forgetting in continual learning. However, existing approaches often require a task identifier at test time, need complex tuning to balance the growing number of parameters, and barely share any information across tasks. As a result, they struggle to scale to a large number of tasks without significant overhead. In this paper, we propose a transformer architecture based on a dedicated encoder/decoder framework. Critically, the encoder and decoder are shared among all tasks. Through a dynamic expansion of special tokens, we specialize each forward pass of our decoder network on a task distribution. Our strategy scales to a large number of tasks with negligible memory and time overhead thanks to strict control of the parameter expansion. Moreover, this efficient strategy does not require any hyperparameter tuning to control the network's expansion. Our model reaches excellent results on CIFAR100 and state-of-the-art performance on the large-scale ImageNet100 and ImageNet1000, while having fewer parameters than concurrent dynamic frameworks.
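
The token-expansion idea described in the abstract is straightforward to sketch in code. The PyTorch snippet below is a minimal illustration under stated assumptions: the class name `DyToxSketch`, the layer sizes, and the plain `nn.MultiheadAttention` standing in for the shared task-attention decoder are our simplifications, not the authors' implementation (see the official code linked above).

```python
# Minimal sketch of the DyTox idea (illustrative, not the official code).
# A shared encoder and a shared decoder are reused across all tasks; each
# new task only adds one small learnable "task token" plus a classifier.
import torch
import torch.nn as nn

class DyToxSketch(nn.Module):
    def __init__(self, dim=384, depth=5, heads=12, classes_per_task=10):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)        # shared across tasks
        self.decoder = nn.MultiheadAttention(dim, heads, batch_first=True)  # shared decoder
        self.task_tokens = nn.ParameterList()                     # grows by one token per task
        self.classifiers = nn.ModuleList()                        # one small head per task
        self.dim, self.classes_per_task = dim, classes_per_task

    def add_task(self):
        # Dynamic expansion: one dim-sized token per new task, so the
        # parameter growth per task is negligible.
        self.task_tokens.append(nn.Parameter(0.02 * torch.randn(1, 1, self.dim)))
        self.classifiers.append(nn.Linear(self.dim, self.classes_per_task))

    def forward(self, patch_embeddings):
        # patch_embeddings: (batch, num_patches, dim), e.g. from a patch/conv stem.
        feats = self.encoder(patch_embeddings)
        logits = []
        for token, head in zip(self.task_tokens, self.classifiers):
            # Each forward pass of the shared decoder is specialized by one
            # task token that cross-attends to the shared patch features.
            query = token.expand(feats.size(0), -1, -1)
            task_embedding, _ = self.decoder(query, feats, feats)
            logits.append(head(task_embedding.squeeze(1)))
        return torch.cat(logits, dim=-1)  # class-incremental prediction over all tasks

model = DyToxSketch()
model.add_task()                         # task 1
out = model(torch.randn(2, 196, 384))    # -> shape (2, 10)
model.add_task()                         # task 2 adds one token + one head
out = model(torch.randn(2, 196, 384))    # -> shape (2, 20)
```

Note how no task identifier is needed at test time: every task token runs through the same shared decoder and the per-task logits are simply concatenated.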

Results

Task                 | Dataset                | Metric                             | Value | Model
---------------------|------------------------|------------------------------------|-------|------
Incremental Learning | ImageNet - 10 steps    | # M Params                         | 11.36 | DyTox
Incremental Learning | ImageNet - 10 steps    | Average Incremental Accuracy       | 71.29 | DyTox
Incremental Learning | ImageNet - 10 steps    | Average Incremental Accuracy Top-5 | 88.59 | DyTox
Incremental Learning | ImageNet - 10 steps    | Final Accuracy                     | 63.34 | DyTox
Incremental Learning | ImageNet - 10 steps    | Final Accuracy Top-5               | 84.49 | DyTox
Incremental Learning | ImageNet100 - 10 steps | # M Params                         | 11.01 | DyTox
Incremental Learning | ImageNet100 - 10 steps | Average Incremental Accuracy       | 77.15 | DyTox
Incremental Learning | ImageNet100 - 10 steps | Average Incremental Accuracy Top-5 | 92.04 | DyTox
Incremental Learning | ImageNet100 - 10 steps | Final Accuracy                     | 69.10 | DyTox
Incremental Learning | ImageNet100 - 10 steps | Final Accuracy Top-5               | 87.98 | DyTox
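
For reference on the metrics above: "Average Incremental Accuracy" is commonly defined (following the iCaRL protocol of Rebuffi et al., 2017) as the mean of the accuracies measured after each incremental step, while "Final Accuracy" is the accuracy after the last step only. A minimal sketch, using made-up per-step values for illustration (only the last one matches the table):

```python
# Hypothetical top-1 accuracies after each of 10 incremental steps
# (illustrative numbers; only the final 63.34 matches DyTox's reported result).
step_accuracies = [82.1, 79.4, 77.0, 75.2, 73.8, 71.9, 70.1, 68.0, 65.9, 63.34]

# "Average Incremental Accuracy": mean of the accuracy after every step.
average_incremental_accuracy = sum(step_accuracies) / len(step_accuracies)

# "Final Accuracy": accuracy after the last step only.
final_accuracy = step_accuracies[-1]

print(f"avg incremental: {average_incremental_accuracy:.2f}")
print(f"final: {final_accuracy:.2f}")
```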

Related Papers

- RegCL: Continual Adaptation of Segment Anything Model via Model Merging (2025-07-16)
- Information-Theoretic Generalization Bounds of Replay-based Continual Learning (2025-07-16)
- PROL: Rehearsal Free Continual Learning in Streaming Data via Prompt Online Learning (2025-07-16)
- Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime (2025-07-15)
- A Neural Network Model of Complementary Learning Systems: Pattern Separation and Completion for Continual Learning (2025-07-15)
- LifelongPR: Lifelong knowledge fusion for point cloud place recognition based on replay and prompt learning (2025-07-14)
- Overcoming catastrophic forgetting in neural networks (2025-07-14)
- Continual Reinforcement Learning by Planning with Online World Models (2025-07-12)