
Learning Video Object Segmentation from Unlabeled Videos

Xiankai Lu, Wenguan Wang, Jianbing Shen, Yu-Wing Tai, David Crandall, Steven C. H. Hoi

Date: 2020-03-10
Venue: CVPR 2020 6
Tasks: Unsupervised Video Object Segmentation, Semi-Supervised Video Object Segmentation, Representation Learning, Segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation

Abstract

We propose a new method for video object segmentation (VOS) that addresses object pattern learning from unlabeled videos, unlike most existing methods, which rely heavily on extensive annotated data. We introduce a unified unsupervised/weakly supervised learning framework, called MuG, that comprehensively captures intrinsic properties of VOS at multiple granularities. Our approach can help advance understanding of visual patterns in VOS and significantly reduce annotation burden. With a carefully designed architecture and strong representation learning ability, our learned model can be applied to diverse VOS settings, including object-level zero-shot VOS, instance-level zero-shot VOS, and one-shot VOS. Experiments demonstrate promising performance in these settings, as well as the potential of MuG in leveraging unlabeled data to further improve the segmentation accuracy.

Results

The rows below are listed under the tasks Video and Video Object Segmentation; the DAVIS 2017 (val) and DAVIS 2016 rows are also listed, with identical values, under Semi-Supervised Video Object Segmentation.

Dataset                  Metric               Value   Model
DAVIS 2017 (val)         F-measure (Decay)    37.4    MuG-W
DAVIS 2017 (val)         F-measure (Mean)     58      MuG-W
DAVIS 2017 (val)         F-measure (Recall)   62.2    MuG-W
DAVIS 2017 (val)         J&F                  56.05   MuG-W
DAVIS 2017 (val)         Jaccard (Decay)      32.5    MuG-W
DAVIS 2017 (val)         Jaccard (Mean)       54.1    MuG-W
DAVIS 2017 (val)         Jaccard (Recall)     60.5    MuG-W
DAVIS 2016               F-measure (Decay)    27.2    MuG-W
DAVIS 2016               F-measure (Mean)     63.6    MuG-W
DAVIS 2016               F-measure (Recall)   67.7    MuG-W
DAVIS 2016               J&F                  64.65   MuG-W
DAVIS 2016               Jaccard (Decay)      26.4    MuG-W
DAVIS 2016               Jaccard (Mean)       65.7    MuG-W
DAVIS 2016               Jaccard (Recall)     77.7    MuG-W
DAVIS 2017 (test-dev)    F-measure (Decay)    -1.7    MuG-W
DAVIS 2017 (test-dev)    F-measure (Mean)     44.5    MuG-W
DAVIS 2017 (test-dev)    F-measure (Recall)   46.6    MuG-W
DAVIS 2017 (test-dev)    J&F                  41.7    MuG-W
DAVIS 2017 (test-dev)    Jaccard (Decay)      -2.7    MuG-W
DAVIS 2017 (test-dev)    Jaccard (Mean)       38.9    MuG-W
DAVIS 2017 (test-dev)    Jaccard (Recall)     44.3    MuG-W
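
For readers unfamiliar with the metrics above, the following is a minimal sketch rather than the official DAVIS evaluation code. It assumes that Jaccard is the intersection-over-union of a predicted and a ground-truth binary mask, and that each J&F row is the arithmetic mean of the matching Jaccard (Mean) and F-measure (Mean) rows, which is consistent with the reported numbers. The function names, the nested-list mask format, and the empty-mask convention are illustrative assumptions, not taken from the paper.

```python
def jaccard(pred, gt):
    """Intersection-over-union of two binary masks given as nested lists of 0/1."""
    inter = 0
    union = 0
    for row_p, row_g in zip(pred, gt):
        for p, g in zip(row_p, row_g):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_and_f(j_mean, f_mean):
    """Combined score, assumed to be the plain average of J mean and F mean."""
    return (j_mean + f_mean) / 2


# (Jaccard mean, F-measure mean, reported J&F) copied from the results above.
REPORTED = {
    "DAVIS 2017 (val)": (54.1, 58.0, 56.05),
    "DAVIS 2016": (65.7, 63.6, 64.65),
    "DAVIS 2017 (test-dev)": (38.9, 44.5, 41.7),
}


def check_reported(tol=1e-6):
    """Return, per dataset, whether the reported J&F equals the average of J and F."""
    return {
        name: abs(j_and_f(j, f) - jf) < tol
        for name, (j, f, jf) in REPORTED.items()
    }


if __name__ == "__main__":
    for name, ok in check_reported().items():
        print(name, "consistent" if ok else "inconsistent")
```

Under these assumptions, all three reported J&F values match the average of their Jaccard and F-measure means, so the table is internally consistent.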
