Dominik Filipiak, Piotr Tempczyk, Marek Cygan
We present n-CPS, a generalisation of the recent state-of-the-art cross pseudo supervision (CPS) approach to semi-supervised semantic segmentation. In n-CPS, n subnetworks are trained simultaneously and learn from each other through one-hot encoding perturbation and consistency regularisation. We also show that ensembling techniques applied to the subnetworks' outputs can significantly improve performance. To the best of our knowledge, n-CPS paired with CutMix outperforms CPS and sets a new state of the art for Pascal VOC 2012 (1/16, 1/8, 1/4, and 1/2 supervised regimes) and Cityscapes (1/16 supervised regime).
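The cross-supervision idea can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: the function names (`n_cps_loss`, `soft_vote`), the averaging over ordered subnetwork pairs, and the choice of soft voting as the ensembling rule are assumptions made for illustration only. Each subnetwork's per-pixel logits are turned into one-hot pseudo labels (the argmax class), and every other subnetwork is penalised with cross-entropy against those labels.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def cross_entropy(logits, target):
    """Cross-entropy of one pixel's logits against a hard class label."""
    return -math.log(softmax(logits)[target])


def n_cps_loss(outputs):
    """Cross pseudo supervision term for n subnetworks (illustrative).

    outputs[k][p] holds the class logits of subnetwork k at pixel p.
    Assumption: the term is averaged over all ordered pairs (i, j), i != j.
    """
    n = len(outputs)
    num_pixels = len(outputs[0])
    # One-hot pseudo labels: the argmax class per pixel. In a real training
    # loop these would be treated as constants (no gradient flows through).
    pseudo = [[argmax(pixel) for pixel in net] for net in outputs]
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Subnetwork i is supervised by the pseudo labels of subnetwork j.
            pair_loss = sum(
                cross_entropy(outputs[i][p], pseudo[j][p])
                for p in range(num_pixels)
            )
            total += pair_loss / num_pixels
            pairs += 1
    return total / pairs


def soft_vote(outputs):
    """Ensemble by averaging softmax probabilities across subnetworks."""
    n = len(outputs)
    predictions = []
    for p in range(len(outputs[0])):
        probs = [softmax(net[p]) for net in outputs]
        mean = [sum(pr[c] for pr in probs) / n for c in range(len(probs[0]))]
        predictions.append(argmax(mean))
    return predictions
```

With n = 2 the loss reduces to the two-network setting that CPS describes. In practice this term would be combined with a supervised loss on the labelled images and computed with an automatic-differentiation framework; the weighting between the two is not specified here.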
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | Pascal VOC 2012 6.25% labeled | Validation mIoU | 75.86 | n-CPS (ResNet-101) |
| Semantic Segmentation | Pascal VOC 2012 6.25% labeled | Validation mIoU | 72.03 | n-CPS (ResNet-50) |
| Semantic Segmentation | Pascal VOC 2012 25% labeled | Validation mIoU | 78.97 | n-CPS (ResNet-101) |
| Semantic Segmentation | Pascal VOC 2012 25% labeled | Validation mIoU | 75.85 | n-CPS (ResNet-50) |
| Semantic Segmentation | Cityscapes 6.25% labeled | Validation mIoU | 76.08 | n-CPS (ResNet-50) |