Gyungin Shin, Samuel Albanie, Weidi Xie
In this paper, we tackle the challenging task of unsupervised salient object detection (SOD) by leveraging spectral clustering on self-supervised features. We make the following contributions: (i) We revisit spectral clustering and demonstrate its potential to group the pixels of salient objects; (ii) Given mask proposals obtained by applying spectral clustering multiple times to image features computed from various self-supervised models, e.g., MoCov2, SwAV, DINO, we propose a simple but effective winner-takes-all voting mechanism for selecting the salient masks, leveraging object priors based on framing and distinctiveness; (iii) Using the selected object segmentations as pseudo ground-truth masks, we train a salient object detector, dubbed SelfMask, which outperforms prior approaches on three unsupervised SOD benchmarks. Code is publicly available at https://github.com/NoelShin/selfmask.
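
As a rough illustration of contributions (i) and (ii), the sketch below partitions a set of feature vectors with a two-way spectral cut and then picks one mask out of several candidates by agreement. It is not the authors' pipeline: the function names are hypothetical, the cut uses only the Fiedler vector of an unnormalised Laplacian built from cosine similarity, and the voting rule (highest mean IoU with the other candidates) is an assumed stand-in for the distinctiveness prior. The framing prior is not modelled.

```python
import numpy as np


def spectral_bipartition(features: np.ndarray) -> np.ndarray:
    """Split N feature vectors into two groups using the Fiedler vector.

    features: (N, D) array, one row per pixel or patch (hypothetical input).
    Returns a boolean array of length N; which side is True is arbitrary.
    """
    # Cosine-similarity affinity, clipped at zero so edge weights are non-negative.
    normed = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-8)
    affinity = np.clip(normed @ normed.T, 0.0, None)
    np.fill_diagonal(affinity, 0.0)

    # Unnormalised graph Laplacian L = D - W.
    degree = affinity.sum(axis=1)
    laplacian = np.diag(degree) - affinity

    # eigh returns eigenvalues in ascending order, so column 1 holds the
    # eigenvector of the second-smallest eigenvalue (the Fiedler vector).
    _, eigvecs = np.linalg.eigh(laplacian)
    return eigvecs[:, 1] > 0


def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union > 0 else 0.0


def select_by_voting(masks: list) -> int:
    """Return the index of the mask that agrees most with the others.

    Assumed rule: score each candidate by its mean IoU with every other
    candidate and keep the single best one (winner takes all).
    """
    scores = []
    for i, mask in enumerate(masks):
        others = [mask_iou(mask, o) for j, o in enumerate(masks) if j != i]
        scores.append(np.mean(others) if others else 0.0)
    return int(np.argmax(scores))
```

In practice the candidates passed to `select_by_voting` would come from several clusterings over features of different self-supervised backbones; here any list of same-shaped boolean arrays works.
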
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Saliency Detection | ECSSD | Accuracy | 95.5 | SelfMask |
| Saliency Detection | ECSSD | IoU | 81.8 | SelfMask |
| Saliency Detection | ECSSD | maximal F-measure | 95.6 | SelfMask |
| Saliency Detection | DUT-OMRON | Accuracy | 91.9 | SelfMask |
| Saliency Detection | DUT-OMRON | IoU | 65.5 | SelfMask |
| Saliency Detection | DUT-OMRON | maximal F-measure | 85.2 | SelfMask |
| Saliency Detection | DUTS | Accuracy | 93.3 | SelfMask |
| Saliency Detection | DUTS | IoU | 66.0 | SelfMask |
| Saliency Detection | DUTS | maximal F-measure | 88.2 | SelfMask |
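
For readers unfamiliar with the three metrics in the table, the following is a minimal sketch of how they are typically computed for a single image. The function names are hypothetical, and two settings are assumptions rather than values stated on this page: the weighting β² = 0.3 in the F-measure (a common convention in the SOD literature) and the sweep over 255 evenly spaced thresholds.

```python
import numpy as np


def pixel_accuracy(pred: np.ndarray, gt: np.ndarray) -> float:
    """Fraction of pixels whose binary label matches the ground truth."""
    return float((pred == gt).mean())


def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over union of the predicted and ground-truth foreground."""
    union = np.logical_or(pred, gt).sum()
    return float(np.logical_and(pred, gt).sum() / union) if union > 0 else 0.0


def max_f_measure(saliency: np.ndarray, gt: np.ndarray,
                  beta_sq: float = 0.3, num_thresholds: int = 255) -> float:
    """Maximum F-beta score over a sweep of binarisation thresholds.

    saliency: float map in [0, 1]; gt: boolean mask of the same shape.
    beta_sq and num_thresholds are assumed defaults, not taken from the table.
    """
    best = 0.0
    for t in np.linspace(0.0, 1.0, num_thresholds, endpoint=False):
        pred = saliency > t
        tp = np.logical_and(pred, gt).sum()
        if tp == 0:
            continue  # precision and recall are both zero at this threshold
        precision = tp / pred.sum()
        recall = tp / gt.sum()
        f = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
        best = max(best, float(f))
    return best
```

Benchmark numbers such as those above are usually obtained by aggregating these per-image scores over a dataset; the exact aggregation is not specified here.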