Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Motion-inductive Self-supervised Object Discovery in Videos

Shuangrui Ding, Weidi Xie, Yabo Chen, Rui Qian, Xiaopeng Zhang, Hongkai Xiong, Qi Tian

2022-10-01
Tasks: Optical Flow Estimation, Object Discovery, Video Segmentation, Video Semantic Segmentation, Unsupervised Object Segmentation

Abstract

In this paper, we consider the task of unsupervised object discovery in videos. Previous works have shown promising results via processing optical flows to segment objects. However, taking flow as input brings about two drawbacks. First, flow cannot capture sufficient cues when objects remain static or partially occluded. Second, it is challenging to establish temporal coherency from flow-only input, due to the missing texture information. To tackle these limitations, we propose a model for directly processing consecutive RGB frames, and infer the optical flow between any pair of frames using a layered representation, with the opacity channels being treated as the segmentation. Additionally, to enforce object permanence, we apply temporal consistency loss on the inferred masks from randomly-paired frames, which refer to the motions at different paces, and encourage the model to segment the objects even if they may not move at the current time point. Experimentally, we demonstrate superior performance over previous state-of-the-art methods on three public video segmentation datasets (DAVIS2016, SegTrackv2, and FBMS-59), while being computationally efficient by avoiding the overhead of computing optical flow as input.
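
The two ideas in the abstract, composing optical flow from a layered representation whose opacity channels double as the segmentation, and penalizing disagreement between masks inferred from different frame pairs, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the softmax normalization of the opacity logits, and the squared-error form of the consistency loss are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def composite_flow(layer_flows, opacity_logits):
    """Blend per-layer flows into a single flow field (hypothetical helper).

    layer_flows:    (K, H, W, 2) flow predicted for each of K layers
    opacity_logits: (K, H, W) unnormalized opacity per layer
    Returns the blended flow (H, W, 2) and the opacities (K, H, W).
    The opacities sum to 1 at each pixel and play the role of soft masks.
    """
    alpha = softmax(opacity_logits, axis=0)
    flow = (alpha[..., None] * layer_flows).sum(axis=0)
    return flow, alpha


def temporal_consistency_loss(mask_a, mask_b):
    """Mean squared difference between two soft masks of the same frame.

    Assumption: mask_a and mask_b were inferred for the same reference
    frame, each paired with a different (randomly chosen) second frame.
    """
    return float(np.mean((mask_a - mask_b) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    flows = rng.normal(size=(2, 4, 5, 2))      # K=2 layers on a 4x5 frame
    logits_near = rng.normal(size=(2, 4, 5))   # frame t paired with t+1
    logits_far = rng.normal(size=(2, 4, 5))    # frame t paired with t+5
    flow, alpha_near = composite_flow(flows, logits_near)
    _, alpha_far = composite_flow(flows, logits_far)
    print(flow.shape, temporal_consistency_loss(alpha_near[1], alpha_far[1]))
```

In a training setup along these lines, the blended flow would be compared against a reference flow for the frame pair, while the consistency term pushes the foreground opacity to stay the same whichever second frame is chosen, including pairs in which the object barely moves.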

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Instance Segmentation | SegTrack-v2 | mIoU | 62.2 | MOD |
| Instance Segmentation | FBMS-59 | mIoU | 61.3 | MOD |
| Instance Segmentation | DAVIS 2016 | J score | 73.9 | MOD |
| Unsupervised Object Segmentation | SegTrack-v2 | mIoU | 62.2 | MOD |
| Unsupervised Object Segmentation | FBMS-59 | mIoU | 61.3 | MOD |
| Unsupervised Object Segmentation | DAVIS 2016 | J score | 73.9 | MOD |
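
Both metrics reported above (mIoU and the J score) are based on the Jaccard index, the intersection over union of a predicted mask and a ground-truth mask. A minimal sketch of that computation follows. The function names are hypothetical, the handling of two empty masks is an assumed convention, and each benchmark's official protocol may average over frames and sequences differently, so this is not a replacement for the official evaluation code.

```python
import numpy as np


def jaccard(pred, gt):
    """Intersection over union of two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)


def mean_jaccard(preds, gts):
    """Average the per-frame Jaccard index over paired mask sequences."""
    return float(np.mean([jaccard(p, g) for p, g in zip(preds, gts)]))


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    gt = [[1, 0], [0, 0]]
    print(jaccard(pred, gt))  # intersection 1, union 2 -> 0.5
```

The function returns a fraction in [0, 1]; the values in the results are on a 0 to 100 scale.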

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Channel-wise Motion Features for Efficient Motion Segmentation (2025-07-17)
Memory-Augmented SAM2 for Training-Free Surgical Video Segmentation (2025-07-13)
An Efficient Approach for Muscle Segmentation and 3D Reconstruction Using Keypoint Tracking in MRI Scan (2025-07-11)
MUVOD: A Novel Multi-view Video Object Segmentation Dataset and A Benchmark for 3D Segmentation (2025-07-10)
Learning to Track Any Points from Human Motion (2025-07-08)
TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation (2025-07-07)
When Does Pruning Benefit Vision Representations? (2025-07-02)