
A Generative Appearance Model for End-to-end Video Object Segmentation

Joakim Johnander, Martin Danelljan, Emil Brissman, Fahad Shahbaz Khan, Michael Felsberg

Published: 2018-11-28 | Venue: CVPR 2019 (June)
Tasks: Semi-Supervised Video Object Segmentation, One-shot visual object segmentation, Segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation

Abstract

One of the fundamental challenges in video object segmentation is to find an effective representation of the target and background appearance. The best performing approaches resort to extensive fine-tuning of a convolutional neural network for this purpose. Besides being prohibitively expensive, this strategy cannot be truly trained end-to-end since the online fine-tuning procedure is not integrated into the offline training of the network. To address these issues, we propose a network architecture that learns a powerful representation of the target and background appearance in a single forward pass. The introduced appearance module learns a probabilistic generative model of target and background feature distributions. Given a new image, it predicts the posterior class probabilities, providing a highly discriminative cue, which is processed in later network modules. Both the learning and prediction stages of our appearance module are fully differentiable, enabling true end-to-end training of the entire segmentation pipeline. Comprehensive experiments demonstrate the effectiveness of the proposed approach on three video object segmentation benchmarks. We close the gap to approaches based on online fine-tuning on DAVIS17, while operating at 15 FPS on a single GPU. Furthermore, our method outperforms all published approaches on the large-scale YouTube-VOS dataset.
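
The appearance module described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single diagonal-covariance Gaussian per class (background and target), NumPy arrays of per-pixel features, and hypothetical function names `fit_class_gaussians` and `posterior`. It only shows the general idea of fitting class-conditional feature distributions from a first-frame mask and then predicting posterior class probabilities for new features using differentiable-style closed-form operations.

```python
import numpy as np


def fit_class_gaussians(features, mask, eps=1e-6):
    """Fit one diagonal Gaussian per class from mask-weighted features.

    features: (N, D) array of per-pixel feature vectors.
    mask:     (N,) array of target probabilities in [0, 1].
    Returns means (2, D), variances (2, D), priors (2,),
    with index 0 = background and index 1 = target.
    """
    weights = np.stack([1.0 - mask, mask], axis=0)          # (2, N)
    totals = weights.sum(axis=1, keepdims=True) + eps        # (2, 1)
    means = weights @ features / totals                      # (2, D)
    sq_dev = (features[None, :, :] - means[:, None, :]) ** 2  # (2, N, D)
    variances = (weights[:, :, None] * sq_dev).sum(axis=1) / totals + eps
    priors = totals[:, 0] / totals.sum()
    return means, variances, priors


def posterior(features, means, variances, priors):
    """Posterior class probabilities via Bayes' rule, shape (N, 2)."""
    diff = features[None, :, :] - means[:, None, :]          # (2, N, D)
    mahalanobis = (diff ** 2 / variances[:, None, :]).sum(axis=-1)  # (2, N)
    log_norm = np.log(2.0 * np.pi * variances).sum(axis=-1)[:, None]  # (2, 1)
    log_joint = -0.5 * (mahalanobis + log_norm) + np.log(priors)[:, None]
    log_joint -= log_joint.max(axis=0, keepdims=True)        # numerical stability
    probs = np.exp(log_joint)
    return (probs / probs.sum(axis=0, keepdims=True)).T


# Toy usage: synthetic "first frame" features with a known mask.
rng = np.random.default_rng(0)
background = rng.normal(-3.0, 1.0, size=(200, 4))
target = rng.normal(3.0, 1.0, size=(100, 4))
first_frame = np.concatenate([background, target])
first_mask = np.concatenate([np.zeros(200), np.ones(100)])

means, variances, priors = fit_class_gaussians(first_frame, first_mask)
new_frame = rng.normal(3.0, 1.0, size=(5, 4))   # features resembling the target
print(posterior(new_frame, means, variances, priors))
```

In the paper, the resulting posterior map is a cue consumed by later network modules; here it is simply printed. Every step is a closed-form array operation, which is the property that lets such a module be trained end-to-end when written in an autodiff framework.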

Results

The same results are listed under three tasks: Video, Video Object Segmentation, and Semi-Supervised Video Object Segmentation.

| Dataset | Metric | Value | Model |
|---|---|---|---|
| DAVIS 2017 (val) | F-measure (Decay) | 15.8 | AGAME |
| DAVIS 2017 (val) | F-measure (Mean) | 73.6 | AGAME |
| DAVIS 2017 (val) | F-measure (Recall) | 83.4 | AGAME |
| DAVIS 2017 (val) | J&F | 71.05 | AGAME |
| DAVIS 2017 (val) | Jaccard (Decay) | 14 | AGAME |
| DAVIS 2017 (val) | Jaccard (Mean) | 68.5 | AGAME |
| DAVIS 2017 (val) | Jaccard (Recall) | 78.4 | AGAME |
| DAVIS 2016 | F-measure (Decay) | 9.8 | AGAME |
| DAVIS 2016 | F-measure (Mean) | 82.2 | AGAME |
| DAVIS 2016 | F-measure (Recall) | 90.3 | AGAME |
| DAVIS 2016 | J&F | 81.85 | AGAME |
| DAVIS 2016 | Jaccard (Decay) | 9.4 | AGAME |
| DAVIS 2016 | Jaccard (Mean) | 81.5 | AGAME |
| DAVIS 2016 | Jaccard (Recall) | 93.6 | AGAME |
| DAVIS 2017 (test-dev) | F-measure (Decay) | 27.6 | AGAME |
| DAVIS 2017 (test-dev) | F-measure (Mean) | 55.3 | AGAME |
| DAVIS 2017 (test-dev) | F-measure (Recall) | 61.1 | AGAME |
| DAVIS 2017 (test-dev) | J&F | 52.3 | AGAME |
| DAVIS 2017 (test-dev) | Jaccard (Decay) | 28.9 | AGAME |
| DAVIS 2017 (test-dev) | Jaccard (Mean) | 49.2 | AGAME |
| DAVIS 2017 (test-dev) | Jaccard (Recall) | 53.2 | AGAME |
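
The J&F values reported above are consistent with the usual DAVIS convention of averaging Jaccard (Mean) and F-measure (Mean). The short check below recomputes them from the listed numbers. The helper name `j_and_f` is hypothetical, and the page does not state how the test-dev value was rounded, so the comparison uses a tolerance rather than an exact match.

```python
def j_and_f(jaccard_mean, f_mean):
    """Average of region similarity (J mean) and contour accuracy (F mean)."""
    return (jaccard_mean + f_mean) / 2.0


# (Jaccard mean, F-measure mean, reported J&F), copied from the results above.
reported = {
    "DAVIS 2017 (val)": (68.5, 73.6, 71.05),
    "DAVIS 2016": (81.5, 82.2, 81.85),
    "DAVIS 2017 (test-dev)": (49.2, 55.3, 52.3),
}

for dataset, (j_mean, f_mean, jf_reported) in reported.items():
    jf_computed = j_and_f(j_mean, f_mean)
    # Allow for one-decimal rounding in the reported figure.
    consistent = abs(jf_computed - jf_reported) <= 0.05 + 1e-9
    print(f"{dataset}: computed {jf_computed:.2f}, "
          f"reported {jf_reported}, consistent={consistent}")
```

For DAVIS 2017 (test-dev) the average is 52.25, which the page lists as 52.3 at one decimal place.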
