
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

CNN in MRF: Video Object Segmentation via Inference in A CNN-Based Higher-Order Spatio-Temporal MRF

Linchao Bao, Baoyuan Wu, Wei Liu

Published: 2018-03-26 · Venue: CVPR 2018

Tasks: One-Shot Segmentation, Semi-Supervised Video Object Segmentation, Optical Flow Estimation, Segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation

Abstract

This paper addresses the problem of video object segmentation, where the initial object mask is given in the first frame of an input video. We propose a novel spatio-temporal Markov Random Field (MRF) model defined over pixels to handle this problem. Unlike conventional MRF models, the spatial dependencies among pixels in our model are encoded by a Convolutional Neural Network (CNN). Specifically, for a given object, the probability of a labeling to a set of spatially neighboring pixels can be predicted by a CNN trained for this specific object. As a result, higher-order, richer dependencies among pixels in the set can be implicitly modeled by the CNN. With temporal dependencies established by optical flow, the resulting MRF model combines both spatial and temporal cues for tackling video object segmentation. However, performing inference in the MRF model is very difficult due to the very high-order dependencies. To this end, we propose a novel CNN-embedded algorithm to perform approximate inference in the MRF. This algorithm proceeds by alternating between a temporal fusion step and a feed-forward CNN step. When initialized with an appearance-based one-shot segmentation CNN, our model outperforms the winning entries of the DAVIS 2017 Challenge, without resorting to model ensembling or any dedicated detectors.
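The alternation described in the abstract (a temporal fusion step followed by a feed-forward spatial step) can be illustrated with a toy sketch. This is not the authors' implementation: `warp_with_flow`, `temporal_fusion`, `threshold_step`, and `alternating_inference` are hypothetical names, the flow is rounded to whole pixels, the fusion is a plain average, and a simple threshold stands in for the object-specific CNN. Keeping the first frame fixed is also an assumption, motivated by the initial mask being given.

```python
import numpy as np

def warp_with_flow(mask, flow):
    """Backward-warp mask (H, W) with a flow field (H, W, 2) holding (dx, dy).

    Assumption: flow is rounded to whole pixels and clipped at the border.
    """
    h, w = mask.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(xs + np.rint(flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(ys + np.rint(flow[..., 1]).astype(int), 0, h - 1)
    return mask[src_y, src_x]

def temporal_fusion(masks, flows_to_prev, flows_to_next):
    """Average each frame's mask with its flow-warped temporal neighbours."""
    fused = []
    n = len(masks)
    for t in range(n):
        votes = [masks[t]]
        if t > 0:
            votes.append(warp_with_flow(masks[t - 1], flows_to_prev[t]))
        if t < n - 1:
            votes.append(warp_with_flow(masks[t + 1], flows_to_next[t]))
        fused.append(np.mean(votes, axis=0))
    return fused

def threshold_step(soft_mask):
    """Placeholder for the feed-forward CNN step: binarise at 0.5."""
    return (soft_mask >= 0.5).astype(float)

def alternating_inference(init_masks, flows_to_prev, flows_to_next,
                          spatial_step=threshold_step, n_iters=3):
    """Alternate temporal fusion and a spatial step for n_iters rounds."""
    masks = [np.asarray(m, dtype=float) for m in init_masks]
    first = masks[0].copy()  # assumption: the given first-frame mask stays fixed
    for _ in range(n_iters):
        fused = temporal_fusion(masks, flows_to_prev, flows_to_next)
        masks = [spatial_step(f) for f in fused]
        masks[0] = first.copy()
    return masks
```

In a real system, `spatial_step` would be replaced by a trained network and the flow fields would come from an optical flow estimator; the sketch only shows how the two steps interleave.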

Results

The same results are listed under three task labels: Video, Video Object Segmentation, and Semi-Supervised Video Object Segmentation.

| Dataset | Metric | Value | Model |
|---|---|---|---|
| DAVIS 2017 (val) | F-measure (Decay) | 26.2 | CINM |
| DAVIS 2017 (val) | F-measure (Mean) | 74 | CINM |
| DAVIS 2017 (val) | F-measure (Recall) | 81.6 | CINM |
| DAVIS 2017 (val) | J&F | 70.6 | CINM |
| DAVIS 2017 (val) | Jaccard (Decay) | 24.6 | CINM |
| DAVIS 2017 (val) | Jaccard (Mean) | 67.2 | CINM |
| DAVIS 2017 (val) | Jaccard (Recall) | 74.5 | CINM |
| DAVIS 2016 | F-measure (Decay) | 14.7 | CINM |
| DAVIS 2016 | F-measure (Mean) | 85 | CINM |
| DAVIS 2016 | F-measure (Recall) | 92.1 | CINM |
| DAVIS 2016 | J&F | 84.2 | CINM |
| DAVIS 2016 | Jaccard (Decay) | 12.3 | CINM |
| DAVIS 2016 | Jaccard (Mean) | 83.4 | CINM |
| DAVIS 2016 | Jaccard (Recall) | 94.9 | CINM |
| YouTube | mIoU | 0.784 | MRFCNN |
| DAVIS 2017 (test-dev) | F-measure (Decay) | 20 | CINM |
| DAVIS 2017 (test-dev) | F-measure (Mean) | 70.5 | CINM |
| DAVIS 2017 (test-dev) | F-measure (Recall) | 79.6 | CINM |
| DAVIS 2017 (test-dev) | J&F | 67.5 | CINM |
| DAVIS 2017 (test-dev) | Jaccard (Decay) | 20 | CINM |
| DAVIS 2017 (test-dev) | Jaccard (Mean) | 64.5 | CINM |
| DAVIS 2017 (test-dev) | Jaccard (Recall) | 73.8 | CINM |
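As a reading aid for the metrics reported above: the Jaccard score is the intersection-over-union of the predicted and ground-truth masks, and the J&F value is the average of the Jaccard mean and the F-measure mean. The sketch below shows both; `jaccard` and `j_and_f` are hypothetical helper names, and treating two empty masks as a perfect match is an assumption made here, not something stated on this page.

```python
import numpy as np

def jaccard(pred, gt):
    """Intersection-over-union of two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # assumption: two empty masks count as a perfect match
    return float(np.logical_and(pred, gt).sum() / union)

def j_and_f(j_mean, f_mean):
    """Combined score: the average of the Jaccard mean and the F-measure mean."""
    return (j_mean + f_mean) / 2.0

# Consistency check against the reported CINM numbers:
# DAVIS 2017 (val): J mean 67.2, F mean 74   -> J&F 70.6
# DAVIS 2016:       J mean 83.4, F mean 85   -> J&F 84.2
# DAVIS 2017 (test-dev): J mean 64.5, F mean 70.5 -> J&F 67.5
```

The reported J&F values on all three DAVIS splits agree with this averaging.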

Related Papers

- SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
- Channel-wise Motion Features for Efficient Motion Segmentation (2025-07-17)
- Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
- DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
- From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
- Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
- SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
- Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)