Weixuan Sun, Zheyuan Liu, Yanhao Zhang, Yiran Zhong, Nick Barnes
The Segment Anything Model (SAM) has demonstrated exceptional performance and versatility, making it a promising tool for various related tasks. In this report, we explore the application of SAM to Weakly-Supervised Semantic Segmentation (WSSS). Specifically, we adapt SAM as a pseudo-label generation pipeline that is given only image-level class labels. While we observe impressive results in most cases, we also identify certain limitations. Our study includes performance evaluations on PASCAL VOC and MS-COCO, where we achieve remarkable improvements over the latest state-of-the-art methods on both datasets. We hope this report encourages further exploration of SAM in WSSS, as well as wider real-world applications.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Semantic Segmentation | COCO 2014 val | mIoU | 55.6 | WSSS-SAM(DeepLabV2-ResNet101) |
| Semantic Segmentation | PASCAL VOC 2012 val | mIoU | 77.2 | WSSS-SAM(ResNet-101, multi-stage) |
| Semantic Segmentation | PASCAL VOC 2012 test | mIoU | 77.1 | WSSS-SAM(DeepLabV2-ResNet101) |
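
The table reports mean intersection-over-union (mIoU). As a reminder of how that metric is usually computed, here is a minimal sketch in plain Python. It is not the evaluation code used for this report: the function names `confusion_matrix` and `mean_iou` are hypothetical, the ignore value 255 is assumed to follow the PASCAL VOC void-label convention, and skipping classes that never appear is one common convention rather than a fixed rule.

```python
from typing import Iterable, List, Sequence, Tuple

LabelPair = Tuple[Sequence[int], Sequence[int]]


def confusion_matrix(pairs: Iterable[LabelPair], num_classes: int,
                     ignore_index: int = 255) -> List[List[int]]:
    """Accumulate pixel counts; rows are ground truth, columns are predictions.

    Each pair holds flattened (ground_truth, prediction) label sequences for
    one image. Pixels whose ground truth equals ignore_index are skipped
    (assumed here to be the void label).
    """
    conf = [[0] * num_classes for _ in range(num_classes)]
    for gt, pred in pairs:
        for g, p in zip(gt, pred):
            if g == ignore_index:
                continue
            conf[g][p] += 1
    return conf


def mean_iou(conf: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                        # class c pixels predicted as another class
        fp = sum(conf[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:                                 # skip classes absent from both gt and pred
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: one five-pixel "image" with two classes and one void pixel.
gt = [0, 0, 1, 1, 255]
pred = [0, 1, 1, 1, 0]
toy_miou = mean_iou(confusion_matrix([(gt, pred)], num_classes=2))
```

In the toy example, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so `toy_miou` is their average. Benchmark numbers are normally obtained by accumulating a single confusion matrix over the whole evaluation set before averaging, which `confusion_matrix` supports by taking any number of image pairs.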