Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


DatUS^2: Data-driven Unsupervised Semantic Segmentation with Pre-trained Self-supervised Vision Transformer

Sonal Kumar, Arijit Sur, Rashmi Dutta Baruah

2024-01-23 · Unsupervised Semantic Segmentation · Segmentation · Semantic Segmentation
Paper · PDF · Code (official)

Abstract

New self-supervised training schemes continue to emerge, each a step closer to a universal foundation model. Unsupervised downstream tasks are one recognized way to validate the quality of the visual features such schemes learn. However, unsupervised dense semantic segmentation has not been explored as a downstream task, even though it can exploit and evaluate the semantic information embedded in the patch-level feature representations learned during self-supervised training of a vision transformer. This paper therefore proposes DatUS^2, a novel data-driven approach to unsupervised semantic segmentation as a downstream task. DatUS^2 generates semantically consistent, dense pseudo-annotated segmentation masks for an unlabeled image dataset without using any visual priors or synchronized data. We compare these pseudo-annotated segmentation masks with ground-truth masks to evaluate how well recent self-supervised training schemes learn shared semantic properties at the patch level and discriminative semantic properties at the segment level. Finally, we evaluate existing state-of-the-art self-supervised training schemes on the proposed downstream task, DatUS^2. The best version of DatUS^2 outperforms the existing state-of-the-art method for unsupervised dense semantic segmentation with 15.02% mIoU and 21.47% pixel accuracy on the SUIM dataset, and it achieves a competitive level of accuracy on the large-scale, complex COCO dataset.
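The abstract evaluates pseudo-annotated masks against ground truth with mean IoU and pixel accuracy. A minimal sketch of those two standard metrics, computed from a confusion matrix with NumPy (this is the conventional definition, not necessarily the paper's exact evaluation code):

```python
import numpy as np

def segmentation_metrics(pred, gt, num_classes):
    """Pixel accuracy and mean IoU for integer label maps.

    pred, gt: same-shape arrays with class ids in [0, num_classes).
    A sketch of the standard metric definitions; the paper's exact
    protocol (ignored labels, per-image vs. global averaging) may differ.
    """
    pred = np.asarray(pred).ravel()
    gt = np.asarray(gt).ravel()
    # Confusion matrix: rows = ground truth, columns = prediction.
    cm = np.bincount(gt * num_classes + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    pixel_acc = np.diag(cm).sum() / cm.sum()
    # Per-class IoU = TP / (TP + FP + FN); skip classes absent from both maps.
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    valid = union > 0
    miou = (tp[valid] / union[valid]).mean()
    return pixel_acc, miou
```

With a perfect prediction both metrics are 1.0; any disagreement lowers pixel accuracy globally and mIoU per class.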

Results

Task                              | Dataset | Metric         | Value | Model
Semantic Segmentation             | SUIM    | Pixel Accuracy | 69.98 | DatUS (ViT-B/8) + OC
Semantic Segmentation             | SUIM    | mIoU           | 34.02 | DatUS (ViT-B/8) + OC
Semantic Segmentation             | SUIM    | Pixel Accuracy | 64.67 | DatUS (ViT-B/8)
Semantic Segmentation             | SUIM    | mIoU           | 28.48 | DatUS (ViT-B/8)
Unsupervised Semantic Segmentation | SUIM   | Pixel Accuracy | 69.98 | DatUS (ViT-B/8) + OC
Unsupervised Semantic Segmentation | SUIM   | mIoU           | 34.02 | DatUS (ViT-B/8) + OC
Unsupervised Semantic Segmentation | SUIM   | Pixel Accuracy | 64.67 | DatUS (ViT-B/8)
Unsupervised Semantic Segmentation | SUIM   | mIoU           | 28.48 | DatUS (ViT-B/8)
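Unsupervised methods output arbitrary cluster ids, so before metrics like the ones in the table above can be computed, predicted clusters are usually relabeled to ground-truth classes via the assignment that maximizes pixel overlap (Hungarian matching). A sketch of that common evaluation step using SciPy; this is a standard protocol, not necessarily the paper's exact implementation:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_match(pred, gt, num_classes):
    """Relabel predicted cluster ids to best-matching ground-truth classes.

    Builds a cluster-vs-class overlap matrix, then solves the assignment
    problem that maximizes total matched pixels. Assumes the number of
    clusters equals num_classes.
    """
    pred = np.asarray(pred)
    shape = pred.shape
    pred_f, gt_f = pred.ravel(), np.asarray(gt).ravel()
    overlap = np.bincount(pred_f * num_classes + gt_f,
                          minlength=num_classes ** 2).reshape(num_classes, num_classes)
    # Negate to turn the maximization into scipy's minimization form.
    rows, cols = linear_sum_assignment(-overlap)
    mapping = {int(r): int(c) for r, c in zip(rows, cols)}
    return np.vectorize(mapping.get)(pred_f).reshape(shape)
```

After relabeling, pixel accuracy and mIoU are computed between the remapped prediction and the ground truth as usual.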

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)