Arthur Douillard, Yifu Chen, Arnaud Dapogny, Matthieu Cord
Deep learning approaches are now ubiquitous in computer vision tasks such as semantic segmentation, and they require large datasets and substantial computational power. Continual learning for semantic segmentation (CSS) is an emerging trend that consists of updating an old model by sequentially adding new classes. However, continual learning methods are usually prone to catastrophic forgetting. This issue is further aggravated in CSS where, at each step, old classes from previous iterations are collapsed into the background. In this paper, we propose Local POD, a multi-scale pooling distillation scheme that preserves long- and short-range spatial relationships at the feature level. Furthermore, we design an entropy-based pseudo-labelling of the background with respect to the classes predicted by the old model, to deal with background shift and avoid catastrophic forgetting of the old classes. Our approach, called PLOP, significantly outperforms state-of-the-art methods in existing CSS scenarios, as well as in newly proposed challenging benchmarks.
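
The entropy-based pseudo-labelling idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed `threshold`, the ignore index `255`, and the flat per-pixel lists are all assumptions made for illustration. A pixel annotated as background takes the old model's predicted class only when that prediction is confident (low entropy); uncertain pixels are marked so that a loss could skip them.

```python
import math

BACKGROUND = 0
IGNORE = 255  # hypothetical ignore index; uncertain pixels get this value


def normalized_entropy(probs):
    """Shannon entropy of a probability vector, scaled to [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def pseudo_label(gt_labels, old_probs, threshold=0.5):
    """Relabel background pixels using the old model's confident predictions.

    gt_labels: per-pixel labels for the current step (old classes appear as 0).
    old_probs: per-pixel probability vectors from the old model (>= 2 classes).
    threshold: assumed fixed cut-off on the normalized entropy.
    """
    out = []
    for label, probs in zip(gt_labels, old_probs):
        if label != BACKGROUND:
            # Pixels annotated with a current class keep their ground truth.
            out.append(label)
            continue
        old_class = max(range(len(probs)), key=probs.__getitem__)
        if old_class == BACKGROUND:
            # The old model also sees background: nothing to recover.
            out.append(BACKGROUND)
        elif normalized_entropy(probs) < threshold:
            # Confident old-class prediction: use it as a pseudo-label.
            out.append(old_class)
        else:
            # Uncertain prediction: exclude the pixel rather than guess.
            out.append(IGNORE)
    return out


labels = pseudo_label(
    gt_labels=[3, 0, 0, 0],
    old_probs=[
        [0.10, 0.80, 0.10],  # annotated as new class 3, probabilities unused
        [0.02, 0.96, 0.02],  # confident old class 1
        [0.30, 0.40, 0.30],  # old class 1, but too uncertain
        [0.90, 0.05, 0.05],  # old model agrees it is background
    ],
)
print(labels)
```

A single global threshold keeps the sketch short; a per-class threshold would be a natural variation where old classes differ in how confidently they are predicted.
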
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | PASCAL VOC 2012 | mIoU | 30.45 | PLOP |
| Semantic Segmentation | PASCAL VOC 2012 | Mean IoU (val) | 70.09 | PLOP |
| Semantic Segmentation | PASCAL VOC 2012 | Mean IoU (val) | 70.08 | MiB |
| Semantic Segmentation | PASCAL VOC 2012 | mIoU | 54.64 | PLOP |
| Semantic Segmentation | PASCAL VOC 2012 | mIoU | 29.29 | MiB |
| Semantic Segmentation | PASCAL VOC 2012 | mIoU | 46.5 | PLOP |
| Semantic Segmentation | PASCAL VOC 2012 | Mean IoU | 64.3 | PLOP |
| Semantic Segmentation | ADE20K | mIoU | 28.75 | PLOP |
| Semantic Segmentation | ADE20K | mIoU | 25.96 | MiB |
| Semantic Segmentation | PASCAL VOC 2012 | mIoU | 8.4 | PLOP |
| Semantic Segmentation | ADE20K | mIoU | 32.94 | PLOP |
| Semantic Segmentation | ADE20K | mIoU | 32.79 | MiB |
| Semantic Segmentation | ADE20K | mIoU | 30.4 | PLOP |
| Semantic Segmentation | ADE20K | mIoU | 29.31 | MiB |
| Semantic Segmentation | ADE20K | Mean IoU (test) | 31.59 | PLOP |
| Semantic Segmentation | ADE20K | Mean IoU (test) | 29.24 | MiB |
| Continual Learning | PASCAL VOC 2012 | mIoU | 30.45 | PLOP |
| Continual Learning | PASCAL VOC 2012 | Mean IoU (val) | 70.09 | PLOP |
| Continual Learning | PASCAL VOC 2012 | Mean IoU (val) | 70.08 | MiB |
| Continual Learning | PASCAL VOC 2012 | mIoU | 54.64 | PLOP |
| Continual Learning | PASCAL VOC 2012 | mIoU | 29.29 | MiB |
| Continual Learning | PASCAL VOC 2012 | mIoU | 46.5 | PLOP |
| Continual Learning | PASCAL VOC 2012 | Mean IoU | 64.3 | PLOP |
| Continual Learning | ADE20K | mIoU | 28.75 | PLOP |
| Continual Learning | ADE20K | mIoU | 25.96 | MiB |
| Continual Learning | PASCAL VOC 2012 | mIoU | 8.4 | PLOP |
| Continual Learning | ADE20K | mIoU | 32.94 | PLOP |
| Continual Learning | ADE20K | mIoU | 32.79 | MiB |
| Continual Learning | ADE20K | mIoU | 30.4 | PLOP |
| Continual Learning | ADE20K | mIoU | 29.31 | MiB |
| Continual Learning | ADE20K | Mean IoU (test) | 31.59 | PLOP |
| Continual Learning | ADE20K | Mean IoU (test) | 29.24 | MiB |
| 2D Semantic Segmentation | PASCAL VOC 2012 | mIoU | 54.64 | PLOP |
| 2D Semantic Segmentation | PASCAL VOC 2012 | mIoU | 29.29 | MiB |
| 2D Semantic Segmentation | PASCAL VOC 2012 | mIoU | 46.5 | PLOP |
| 2D Semantic Segmentation | PASCAL VOC 2012 | Mean IoU | 64.3 | PLOP |
| 2D Semantic Segmentation | PASCAL VOC 2012 | mIoU | 8.4 | PLOP |
| Class Incremental Learning | PASCAL VOC 2012 | mIoU | 30.45 | PLOP |
| Class Incremental Learning | PASCAL VOC 2012 | Mean IoU (val) | 70.09 | PLOP |
| Class Incremental Learning | PASCAL VOC 2012 | Mean IoU (val) | 70.08 | MiB |
| Class Incremental Learning | PASCAL VOC 2012 | mIoU | 54.64 | PLOP |
| Class Incremental Learning | PASCAL VOC 2012 | mIoU | 29.29 | MiB |
| Class Incremental Learning | PASCAL VOC 2012 | mIoU | 46.5 | PLOP |
| Class Incremental Learning | PASCAL VOC 2012 | Mean IoU | 64.3 | PLOP |
| Class Incremental Learning | ADE20K | mIoU | 28.75 | PLOP |
| Class Incremental Learning | ADE20K | mIoU | 25.96 | MiB |
| Class Incremental Learning | PASCAL VOC 2012 | mIoU | 8.4 | PLOP |
| Class Incremental Learning | ADE20K | mIoU | 32.94 | PLOP |
| Class Incremental Learning | ADE20K | mIoU | 32.79 | MiB |
| Class Incremental Learning | ADE20K | mIoU | 30.4 | PLOP |
| Class Incremental Learning | ADE20K | mIoU | 29.31 | MiB |
| Class Incremental Learning | ADE20K | Mean IoU (test) | 31.59 | PLOP |
| Class Incremental Learning | ADE20K | Mean IoU (test) | 29.24 | MiB |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | mIoU | 30.45 | PLOP |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | Mean IoU (val) | 70.09 | PLOP |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | Mean IoU (val) | 70.08 | MiB |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | mIoU | 54.64 | PLOP |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | mIoU | 29.29 | MiB |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | mIoU | 46.5 | PLOP |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | Mean IoU | 64.3 | PLOP |
| Class-Incremental Semantic Segmentation | ADE20K | mIoU | 28.75 | PLOP |
| Class-Incremental Semantic Segmentation | ADE20K | mIoU | 25.96 | MiB |
| Class-Incremental Semantic Segmentation | PASCAL VOC 2012 | mIoU | 8.4 | PLOP |
| Class-Incremental Semantic Segmentation | ADE20K | mIoU | 32.94 | PLOP |
| Class-Incremental Semantic Segmentation | ADE20K | mIoU | 32.79 | MiB |
| Class-Incremental Semantic Segmentation | ADE20K | mIoU | 30.4 | PLOP |
| Class-Incremental Semantic Segmentation | ADE20K | mIoU | 29.31 | MiB |
| Class-Incremental Semantic Segmentation | ADE20K | Mean IoU (test) | 31.59 | PLOP |
| Class-Incremental Semantic Segmentation | ADE20K | Mean IoU (test) | 29.24 | MiB |
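
The mIoU figures in the table appear to be percentages. As a reminder of what the metric measures, here is a minimal sketch of mean intersection-over-union on flat label lists. The function name, the `ignore_index` convention, and the choice to skip classes that never appear are assumptions; the benchmark's own evaluation code may differ in these details.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union in percent over flat label sequences.

    Pixels whose target equals ignore_index are skipped. Predictions are
    assumed to lie in [0, num_classes). Classes absent from both prediction
    and target are left out of the average.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    if not ious:
        raise ValueError("no valid pixels to evaluate")
    return 100.0 * sum(ious) / len(ious)


score = mean_iou(
    pred=[0, 0, 1, 1, 2, 2],
    target=[0, 1, 1, 1, 2, 0],
    num_classes=3,
)
print(f"mIoU = {score:.2f}")
```

In the example, the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 50%.
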