Moshe Kimhi, Shai Kimhi, Evgenii Zheltonozhskii, Or Litany, Chaim Baskin
We present a novel confidence refinement scheme that enhances pseudo labels in semi-supervised semantic segmentation. Unlike existing methods, which filter pixels with low-confidence predictions in isolation, our approach leverages the spatial correlation of labels in segmentation maps by grouping neighboring pixels and considering their pseudo labels collectively. With this contextual information, our method, named S4MC, increases the amount of unlabeled data used during training while maintaining the quality of the pseudo labels, all with negligible computational overhead. Through extensive experiments on standard benchmarks, we demonstrate that S4MC outperforms existing state-of-the-art semi-supervised learning approaches, offering a promising solution for reducing the cost of acquiring dense annotations. For example, S4MC achieves a 1.39 mIoU improvement over the prior art on PASCAL VOC 12 with 366 annotated images. The code to reproduce our experiments is available at https://s4mcontext.github.io/
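The idea of refining a pixel's confidence with its neighbors can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the refinement rule, so the 3x3 neighborhood, the "at least one of two independent events" combination `p + q - p*q`, the function name `refine_confidence`, and the `threshold` value are all assumptions chosen for illustration.

```python
import numpy as np

def refine_confidence(probs, threshold=0.95):
    """Illustrative neighbor-aware confidence refinement (assumed rule, not S4MC itself).

    probs: float array of shape (C, H, W) holding per-pixel class probabilities.
    Returns (labels, mask, refined), where mask marks pixels kept as pseudo labels.
    """
    _, h, w = probs.shape
    # Zero-pad spatially so border pixels simply see fewer neighbors.
    padded = np.pad(probs, ((0, 0), (1, 1), (1, 1)), mode="constant")

    # For each class, take the strongest probability among the 8 neighbors.
    neighbor_max = np.zeros_like(probs)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # skip the pixel itself
            shifted = padded[:, 1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            neighbor_max = np.maximum(neighbor_max, shifted)

    # Assumed combination: probability that the pixel OR its neighbor
    # belongs to the class, treating the two predictions as independent.
    refined = probs + neighbor_max - probs * neighbor_max

    labels = refined.argmax(axis=0)            # pseudo label per pixel
    mask = refined.max(axis=0) >= threshold    # pixels confident enough to train on
    return labels, mask, refined
```

Under these assumptions, a region where every pixel predicts the same class with probability 0.8 would be discarded by a plain 0.95 threshold, but its refined confidence is 0.8 + 0.8 - 0.64 = 0.96, so the pixels are retained. This mirrors the abstract's claim that contextual information lets more unlabeled pixels contribute to training.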
| Task | Dataset | Metric | Value (%) | Model |
|---|---|---|---|---|
| Semantic Segmentation | COCO 1/512 labeled | Validation mIoU | 32.9 | S4MC |
| Semantic Segmentation | PASCAL VOC 2012 50% | Validation mIoU | 81.11 | S4MC |
| Semantic Segmentation | COCO 1/256 labeled | Validation mIoU | 40.4 | S4MC |
| Semantic Segmentation | PASCAL VOC 2012 6.25% labeled | Validation mIoU | 78.84 | S4MC |
| Semantic Segmentation | PASCAL VOC 2012 92 labeled | Validation mIoU | 74.72 | S4MC |
| Semantic Segmentation | PASCAL VOC 2012 732 labeled | Validation mIoU | 80.12 | S4MC |
| Semantic Segmentation | PASCAL VOC 2012 732 labeled | Validation mIoU | 77.83 | S4MC (R50) |
| Semantic Segmentation | PASCAL VOC 2012 1464 labeled | Validation mIoU | 81.56 | S4MC |
| Semantic Segmentation | PASCAL VOC 2012 1464 labeled | Validation mIoU | 79.41 | S4MC (R50) |
| Semantic Segmentation | PASCAL VOC 2012 25% labeled | Validation mIoU | 79.85 | S4MC |
| Semantic Segmentation | COCO 1/128 labeled | Validation mIoU | 43.78 | S4MC |
| Semantic Segmentation | COCO 1/64 labeled | Validation mIoU | 47.98 | S4MC |
| Semantic Segmentation | PASCAL VOC 2012 366 labeled | Validation mIoU | 79.09 | S4MC |
| Semantic Segmentation | Cityscapes 6.25% labeled | Validation mIoU | 77 | S4MC |
| Semantic Segmentation | COCO 1/32 labeled | Validation mIoU | 50.58 | S4MC |
| Semantic Segmentation | PASCAL VOC 2012 183 labeled | Validation mIoU | 75.21 | S4MC |