Zhizhong Li, Derek Hoiem
When building a unified vision system or gradually adding new capabilities to a system, the usual assumption is that training data for all tasks is always available. However, as the number of tasks grows, storing and retraining on such data becomes infeasible. A new problem arises where we add new capabilities to a Convolutional Neural Network (CNN), but the training data for its existing capabilities is unavailable. We propose our Learning without Forgetting method, which uses only new task data to train the network while preserving the original capabilities. Our method performs favorably compared to the commonly used feature extraction and fine-tuning adaptation techniques, and performs similarly to multitask learning that uses the original task data we assume to be unavailable. A more surprising observation is that, when the old and new task datasets are similar, Learning without Forgetting may be able to replace fine-tuning for improved new task performance.
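
To make "uses only new task data while preserving the original capabilities" concrete, here is a minimal per-example sketch of the kind of objective this implies: a standard cross-entropy term on the new task plus a distillation-style term that keeps the old-task outputs close to the responses the original network gave on the same new-task input. The abstract does not spell out the loss, so the function names, the `lambda_old` weight, and the `temperature` default are illustrative assumptions, and the weight regularization used in real training is omitted.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy H(target, predicted) between two distributions."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(target_probs, predicted_probs))


def lwf_loss(new_logits, new_label, old_logits, recorded_old_logits,
             lambda_old=1.0, temperature=2.0):
    """Per-example loss sketch (hypothetical helper, not the authors' code).

    new_logits          -- current network outputs on the new-task head
    new_label           -- ground-truth class index for the new task
    old_logits          -- current network outputs on the old-task head
    recorded_old_logits -- old-task outputs of the ORIGINAL network on this
                           same new-task input, recorded before training
    lambda_old, temperature -- assumed defaults, chosen for illustration
    """
    # New-task term: ordinary cross-entropy against the true label.
    new_probs = softmax(new_logits)
    loss_new = -math.log(new_probs[new_label] + 1e-12)

    # Old-task term: softened outputs should match the recorded responses,
    # so no old-task training data is needed.
    target = softmax(recorded_old_logits, temperature)
    current = softmax(old_logits, temperature)
    loss_old = cross_entropy(target, current)

    return loss_new + lambda_old * loss_old
```

For example, calling `lwf_loss([1.0, 3.0], 1, [2.0, 0.5, -1.0], [2.0, 0.5, -1.0])` evaluates the case where the old-task head has not drifted; passing different `old_logits` raises the old-task term, which is what penalizes forgetting during training.
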
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | PASCAL VOC 2012 | mIoU | 4.8 | LWF |
| Semantic Segmentation | PASCAL VOC 2012 | Mean IoU (val) | 55 | LWF |
| Semantic Segmentation | PASCAL VOC 2012 | mIoU | 5.5 | LWF |
| Semantic Segmentation | PASCAL VOC 2012 | mIoU | 5.3 | LWF |
| Semantic Segmentation | PASCAL VOC 2012 | Mean IoU | 54.9 | LWF |
| Semantic Segmentation | PASCAL VOC 2012 | mIoU | 4.3 | LWF |
| Continual Learning | visual domain decathlon (10 tasks) | Avg. Accuracy | 76.93 | LwF |
| Continual Learning | visual domain decathlon (10 tasks) | decathlon discipline (Score) | 2515 | LwF |
| Continual Learning | PASCAL VOC 2012 | mIoU | 4.8 | LWF |
| Continual Learning | PASCAL VOC 2012 | Mean IoU (val) | 55 | LWF |
| Continual Learning | PASCAL VOC 2012 | mIoU | 5.5 | LWF |
| Continual Learning | PASCAL VOC 2012 | mIoU | 5.3 | LWF |
| Continual Learning | PASCAL VOC 2012 | Mean IoU | 54.9 | LWF |
| Continual Learning | PASCAL VOC 2012 | mIoU | 4.3 | LWF |
| 2D Semantic Segmentation | PASCAL VOC 2012 | mIoU | 5.5 | LWF |
| 2D Semantic Segmentation | PASCAL VOC 2012 | mIoU | 5.3 | LWF |
| 2D Semantic Segmentation | PASCAL VOC 2012 | Mean IoU | 54.9 | LWF |
| 2D Semantic Segmentation | PASCAL VOC 2012 | mIoU | 4.3 | LWF |
| Class Incremental Learning | PASCAL VOC 2012 | mIoU | 4.8 | LWF |
| Class Incremental Learning | PASCAL VOC 2012 | Mean IoU (val) | 55 | LWF |
| Class Incremental Learning | PASCAL VOC 2012 | mIoU | 5.5 | LWF |
| Class Incremental Learning | PASCAL VOC 2012 | mIoU | 5.3 | LWF |
| Class Incremental Learning | PASCAL VOC 2012 | Mean IoU | 54.9 | LWF |
| Class Incremental Learning | PASCAL VOC 2012 | mIoU | 4.3 | LWF |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | mIoU | 4.8 | LWF |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | Mean IoU (val) | 55 | LWF |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | mIoU | 5.5 | LWF |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | mIoU | 5.3 | LWF |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | Mean IoU | 54.9 | LWF |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | mIoU | 4.3 | LWF |
| 10-shot image generation | PASCAL VOC 2012 | mIoU | 4.8 | LWF |
| 10-shot image generation | PASCAL VOC 2012 | Mean IoU (val) | 55 | LWF |
| 10-shot image generation | PASCAL VOC 2012 | mIoU | 5.5 | LWF |
| 10-shot image generation | PASCAL VOC 2012 | mIoU | 5.3 | LWF |
| 10-shot image generation | PASCAL VOC 2012 | Mean IoU | 54.9 | LWF |
| 10-shot image generation | PASCAL VOC 2012 | mIoU | 4.3 | LWF |