RVOS: End-to-End Recurrent Network for Video Object Segmentation

Carles Ventura, Miriam Bellver, Andreu Girbau, Amaia Salvador, Ferran Marques, Xavier Giro-i-Nieto

2019-03-13 | CVPR 2019 6 | Unsupervised Video Object Segmentation, Semi-Supervised Video Object Segmentation, One-shot visual object segmentation, Segmentation, Video Object Segmentation

Abstract

Multiple-object video object segmentation is a challenging task, especially in the zero-shot case, when no object mask is given at the initial frame and the model has to find the objects to be segmented along the sequence. In our work, we propose a Recurrent network for multiple object Video Object Segmentation (RVOS) that is fully end-to-end trainable. Our model incorporates recurrence on two different domains: (i) the spatial, which allows the model to discover the different object instances within a frame, and (ii) the temporal, which allows it to keep the segmented objects coherent over time. We train RVOS for zero-shot video object segmentation and are the first to report quantitative results for the DAVIS-2017 and YouTube-VOS benchmarks. Further, we adapt RVOS for one-shot video object segmentation by using the masks obtained in previous time steps as inputs to be processed by the recurrent module. Our model reaches results comparable to state-of-the-art techniques on the YouTube-VOS benchmark and outperforms all previous video object segmentation methods that do not use online learning on the DAVIS-2017 benchmark. Moreover, our model achieves faster inference runtimes than previous methods, reaching 44 ms/frame on a P100 GPU.
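
The two recurrence domains described in the abstract can be pictured with a toy loop. This is a minimal sketch and not the authors' implementation: the function names, the scalar weights, and the tanh cell are all hypothetical stand-ins for the learned recurrent module. It only illustrates the data flow, where each hidden state depends on the previous instance in the same frame (spatial) and on the same instance in the previous frame (temporal).

```python
import numpy as np

def toy_cell(x, h_spatial, h_temporal, w):
    """Hypothetical recurrent cell: mixes frame features with both hidden states."""
    return np.tanh(w[0] * x + w[1] * h_spatial + w[2] * h_temporal)

def run_two_domain_recurrence(frames, num_instances, w=(1.0, 0.5, 0.5)):
    """Return hidden states of shape (T, N, H, W) for T frames and N instances."""
    num_frames = len(frames)
    shape = frames[0].shape
    hidden = np.zeros((num_frames, num_instances) + shape)
    for t in range(num_frames):
        # Spatial recurrence restarts at every frame.
        h_spatial = np.zeros(shape)
        for i in range(num_instances):
            # Temporal recurrence: same instance slot, previous frame.
            h_temporal = hidden[t - 1, i] if t > 0 else np.zeros(shape)
            hidden[t, i] = toy_cell(frames[t], h_spatial, h_temporal, w)
            # Pass this instance's state on to the next instance in the frame.
            h_spatial = hidden[t, i]
    return hidden

# Three random 4x4 "feature maps" stand in for encoded video frames.
frames = [np.random.rand(4, 4) for _ in range(3)]
hidden = run_two_domain_recurrence(frames, num_instances=2)
print(hidden.shape)
```

In the one-shot setting the abstract describes, the masks predicted at previous time steps would be fed in as an additional input to the recurrent module; the sketch omits that input.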

Results

Task | Dataset | Metric | Value | Model
Video | DAVIS 2017 (val) | F-measure (Decay) | 28.2 | RVOS
Video | DAVIS 2017 (val) | F-measure (Mean) | 63.6 | RVOS
Video | DAVIS 2017 (val) | F-measure (Recall) | 73.2 | RVOS
Video | DAVIS 2017 (val) | J&F | 60.55 | RVOS
Video | DAVIS 2017 (val) | Jaccard (Decay) | 24.9 | RVOS
Video | DAVIS 2017 (val) | Jaccard (Mean) | 57.5 | RVOS
Video | DAVIS 2017 (val) | Jaccard (Recall) | 65.2 | RVOS
Video | DAVIS 2017 (test-dev) | F-measure (Decay) | 36.7 | RVOS
Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 52.6 | RVOS
Video | DAVIS 2017 (test-dev) | F-measure (Recall) | 61.7 | RVOS
Video | DAVIS 2017 (test-dev) | J&F | 50.3 | RVOS
Video | DAVIS 2017 (test-dev) | Jaccard (Decay) | 35.7 | RVOS
Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 47.9 | RVOS
Video | DAVIS 2017 (test-dev) | Jaccard (Recall) | 54.4 | RVOS
Video | YouTube-VOS 2018 | F-Measure (Seen) | 67.2 | RVOS
Video | YouTube-VOS 2018 | F-Measure (Unseen) | 51 | RVOS
Video | YouTube-VOS 2018 | Jaccard (Seen) | 63.6 | RVOS
Video | YouTube-VOS 2018 | Overall | 56.8 | RVOS
Video | YouTube-VOS 2018 | Speed (FPS) | 45.5 | RVOS
Video | YouTube-VOS | F-Measure (Seen) | 67.2 | RVOS-Mask-ST+
Video | YouTube-VOS | F-Measure (Unseen) | 51 | RVOS-Mask-ST+
Video | YouTube-VOS | Jaccard (Seen) | 63.6 | RVOS-Mask-ST+
Video | YouTube-VOS | Jaccard (Unseen) | 45.5 | RVOS-Mask-ST+
Video | DAVIS 2017 (test-dev) | F-measure (Decay) | 1.8 | RVOS
Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 27.3 | RVOS
Video | DAVIS 2017 (test-dev) | F-measure (Recall) | 24.8 | RVOS
Video | DAVIS 2017 (test-dev) | J&F | 22.5 | RVOS
Video | DAVIS 2017 (test-dev) | Jaccard (Decay) | 1.6 | RVOS
Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 17.7 | RVOS
Video | DAVIS 2017 (test-dev) | Jaccard (Recall) | 16.2 | RVOS
Video | DAVIS 2017 (val) | F-measure (Mean) | 45.7 | RVOS
Video | DAVIS 2017 (val) | F-measure (Recall) | 46.4 | RVOS
Video | DAVIS 2017 (val) | J&F | 41.2 | RVOS
Video | DAVIS 2017 (val) | Jaccard (Mean) | 36.8 | RVOS
Video | DAVIS 2017 (val) | Jaccard (Recall) | 40.2 | RVOS
Video Object Segmentation | DAVIS 2017 (val) | F-measure (Decay) | 28.2 | RVOS
Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 63.6 | RVOS
Video Object Segmentation | DAVIS 2017 (val) | F-measure (Recall) | 73.2 | RVOS
Video Object Segmentation | DAVIS 2017 (val) | J&F | 60.55 | RVOS
Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Decay) | 24.9 | RVOS
Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 57.5 | RVOS
Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Recall) | 65.2 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Decay) | 36.7 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 52.6 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Recall) | 61.7 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 50.3 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Decay) | 35.7 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 47.9 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Recall) | 54.4 | RVOS
Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 67.2 | RVOS
Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 51 | RVOS
Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 63.6 | RVOS
Video Object Segmentation | YouTube-VOS 2018 | Overall | 56.8 | RVOS
Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 45.5 | RVOS
Video Object Segmentation | YouTube-VOS | F-Measure (Seen) | 67.2 | RVOS-Mask-ST+
Video Object Segmentation | YouTube-VOS | F-Measure (Unseen) | 51 | RVOS-Mask-ST+
Video Object Segmentation | YouTube-VOS | Jaccard (Seen) | 63.6 | RVOS-Mask-ST+
Video Object Segmentation | YouTube-VOS | Jaccard (Unseen) | 45.5 | RVOS-Mask-ST+
Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Decay) | 1.8 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 27.3 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Recall) | 24.8 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 22.5 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Decay) | 1.6 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 17.7 | RVOS
Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Recall) | 16.2 | RVOS
Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 45.7 | RVOS
Video Object Segmentation | DAVIS 2017 (val) | F-measure (Recall) | 46.4 | RVOS
Video Object Segmentation | DAVIS 2017 (val) | J&F | 41.2 | RVOS
Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 36.8 | RVOS
Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Recall) | 40.2 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | F-measure (Decay) | 28.2 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 63.6 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | F-measure (Recall) | 73.2 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | J&F | 60.55 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Decay) | 24.9 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 57.5 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Recall) | 65.2 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Decay) | 36.7 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 52.6 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Recall) | 61.7 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 50.3 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Decay) | 35.7 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 47.9 | RVOS
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Recall) | 54.4 | RVOS
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 67.2 | RVOS
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 51 | RVOS
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 63.6 | RVOS
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 56.8 | RVOS
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 45.5 | RVOS
Semi-Supervised Video Object Segmentation | YouTube-VOS | F-Measure (Seen) | 67.2 | RVOS-Mask-ST+
Semi-Supervised Video Object Segmentation | YouTube-VOS | F-Measure (Unseen) | 51 | RVOS-Mask-ST+
Semi-Supervised Video Object Segmentation | YouTube-VOS | Jaccard (Seen) | 63.6 | RVOS-Mask-ST+
Semi-Supervised Video Object Segmentation | YouTube-VOS | Jaccard (Unseen) | 45.5 | RVOS-Mask-ST+
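
The aggregate rows in the results can be cross-checked against their components. On DAVIS, J&F is the mean of Jaccard (Mean) and F-measure (Mean); on YouTube-VOS, the overall score is the mean of the four seen/unseen Jaccard and F-measure values. The short sketch below redoes that arithmetic with numbers copied from the rows above. The helper name `mean_of` is our own, and the Jaccard (Unseen) value is taken from the RVOS-Mask-ST+ rows, on the assumption that they describe the same submission as the YouTube-VOS 2018 rows, which do not list that metric.

```python
def mean_of(*values):
    """Arithmetic mean of the given metric values."""
    return sum(values) / len(values)

# DAVIS 2017 (val), first RVOS entry: Jaccard (Mean) and F-measure (Mean).
jf_val = mean_of(57.5, 63.6)

# DAVIS 2017 (test-dev), first RVOS entry.
jf_test_dev = mean_of(47.9, 52.6)

# YouTube-VOS: Jaccard (Seen), Jaccard (Unseen), F-Measure (Seen), F-Measure (Unseen).
overall_ytvos = mean_of(63.6, 45.5, 67.2, 51.0)

print(f"DAVIS val J&F:      {jf_val:.2f}")
print(f"DAVIS test-dev J&F: {jf_test_dev:.2f}")
print(f"YouTube-VOS overall: {overall_ytvos:.3f}")
```

The computed values (about 60.55, 50.25, and 56.825) agree with the listed 60.55, 50.3, and 56.8 up to rounding.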

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
Deep Learning-Based Fetal Lung Segmentation from Diffusion-weighted MRI Images and Lung Maturity Evaluation for Fetal Growth Restriction (2025-07-17)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
From Variability To Accuracy: Conditional Bernoulli Diffusion Models with Consensus-Driven Correction for Thin Structure Segmentation (2025-07-17)
Unleashing Vision Foundation Models for Coronary Artery Segmentation: Parallel ViT-CNN Encoding and Variational Fusion (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)