
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Efficient Regional Memory Network for Video Object Segmentation

Haozhe Xie, Hongxun Yao, Shangchen Zhou, Shengping Zhang, Wenxiu Sun

Published: 2021-03-24, CVPR 2021
Tasks: Semi-Supervised Video Object Segmentation, Optical Flow Estimation, One-shot visual object segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation

Abstract

Recently, several Space-Time Memory based networks have shown that object cues from past frames (e.g., the video frames and their segmented object masks) are useful for segmenting objects in the current frame. However, these methods exploit the information in the memory by global-to-global matching between the current and past frames, which leads to mismatches with similar objects and high computational complexity. To address these problems, we propose a novel local-to-local matching solution for semi-supervised VOS, namely Regional Memory Network (RMNet). In RMNet, a precise regional memory is constructed by memorizing the local regions where the target objects appear in past frames. For the current query frame, the query regions are tracked and predicted based on the optical flow estimated from the previous frame. The proposed local-to-local matching alleviates the ambiguity of similar objects in both memory and query frames, which allows information to be passed from the regional memory to the query region efficiently and effectively. Experimental results indicate that the proposed RMNet performs favorably against state-of-the-art methods on the DAVIS and YouTube-VOS datasets.
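The two ideas in the abstract, predicting a query region from optical flow and matching only region against region, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `track_region` and `regional_affinity`, the bounding-box representation of a region, the `pad` parameter, and the dot-product softmax affinity are all assumptions made for illustration.

```python
import numpy as np

def track_region(prev_mask, flow, pad=2):
    """Predict the query region as a padded bounding box (hypothetical helper).

    prev_mask: (H, W) bool array, object mask in the previous frame.
    flow: (H, W, 2) float array, flow[y, x] = (dx, dy) from previous to current frame.
    Returns (y0, x0, y1, x1) with exclusive upper bounds, or None if nothing lands in frame.
    """
    h, w = prev_mask.shape
    ys, xs = np.nonzero(prev_mask)
    if ys.size == 0:
        return None
    # Move every object pixel along its flow vector.
    new_x = np.rint(xs + flow[ys, xs, 0]).astype(int)
    new_y = np.rint(ys + flow[ys, xs, 1]).astype(int)
    inside = (new_x >= 0) & (new_x < w) & (new_y >= 0) & (new_y < h)
    if not inside.any():
        return None
    new_x, new_y = new_x[inside], new_y[inside]
    # Bounding box of the moved pixels, padded and clipped to the frame.
    y0 = max(int(new_y.min()) - pad, 0)
    x0 = max(int(new_x.min()) - pad, 0)
    y1 = min(int(new_y.max()) + 1 + pad, h)
    x1 = min(int(new_x.max()) + 1 + pad, w)
    return y0, x0, y1, x1

def regional_affinity(query_feat, memory_feat, query_box, memory_box):
    """Local-to-local matching: softmax affinity between two cropped regions.

    query_feat, memory_feat: (H, W, C) feature maps.
    Returns an (Nq, Nm) array; each row sums to 1 over memory-region locations.
    """
    qy0, qx0, qy1, qx1 = query_box
    my0, mx0, my1, mx1 = memory_box
    q = query_feat[qy0:qy1, qx0:qx1].reshape(-1, query_feat.shape[-1])
    m = memory_feat[my0:my1, mx0:mx1].reshape(-1, memory_feat.shape[-1])
    logits = q @ m.T
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

# Toy example: a 2x2 object that moves 3 pixels right and 1 pixel down.
prev_mask = np.zeros((8, 8), dtype=bool)
prev_mask[2:4, 2:4] = True
flow = np.zeros((8, 8, 2))
flow[..., 0] = 3.0
flow[..., 1] = 1.0
query_box = track_region(prev_mask, flow, pad=1)

rng = np.random.default_rng(0)
query_feat = rng.normal(size=(8, 8, 4))
memory_feat = rng.normal(size=(8, 8, 4))
memory_box = (1, 1, 5, 5)  # object region in the past frame, padded by 1
affinity = regional_affinity(query_feat, memory_feat, query_box, memory_box)
```

In this toy case the affinity matrix covers 16 query locations by 16 memory locations rather than 64 by 64 for the full frames, which is the kind of saving that restricting matching to regions is meant to provide.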

Results

The same results are listed under three tasks: Video, Video Object Segmentation, and Semi-Supervised Video Object Segmentation.

| Dataset | Metric | Value | Model |
| --- | --- | --- | --- |
| DAVIS 2017 (val) | F-measure (Mean) | 86 | RMNet |
| DAVIS 2017 (val) | J&F | 83.5 | RMNet |
| DAVIS 2017 (val) | Jaccard (Mean) | 81 | RMNet |
| DAVIS 2016 | F-measure (Mean) | 88.7 | RMNet |
| DAVIS 2016 | J&F | 88.8 | RMNet |
| DAVIS 2016 | Jaccard (Mean) | 88.9 | RMNet |
| DAVIS 2017 (test-dev) | F-measure (Mean) | 78.1 | RMNet |
| DAVIS 2017 (test-dev) | J&F | 75 | RMNet |
| DAVIS 2017 (test-dev) | Jaccard (Mean) | 71.9 | RMNet |
| DAVIS (no YouTube-VOS training) | D16 val (F) | 82.3 | RMNet |
| DAVIS (no YouTube-VOS training) | D16 val (G) | 81.5 | RMNet |
| DAVIS (no YouTube-VOS training) | D16 val (J) | 80.6 | RMNet |
| DAVIS (no YouTube-VOS training) | D17 val (F) | 77.2 | RMNet |
| DAVIS (no YouTube-VOS training) | D17 val (G) | 75 | RMNet |
| DAVIS (no YouTube-VOS training) | D17 val (J) | 72.8 | RMNet |
| DAVIS (no YouTube-VOS training) | FPS | 11.9 | RMNet |
| YouTube-VOS 2018 | F-Measure (Seen) | 85.7 | RMNet |
| YouTube-VOS 2018 | F-Measure (Unseen) | 82.4 | RMNet |
| YouTube-VOS 2018 | Jaccard (Seen) | 82.1 | RMNet |
| YouTube-VOS 2018 | Jaccard (Unseen) | 75.7 | RMNet |
| YouTube-VOS 2018 | Overall | 81.5 | RMNet |
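The summary scores in the results can be cross-checked against their components. On DAVIS, J&F (also written G) is the mean of the Jaccard and F-measure means; on YouTube-VOS, the overall score is the mean of the four seen/unseen J and F values. The short sketch below recomputes these from the reported numbers. The helper name `consistent` and the tolerance are assumptions chosen for illustration; because the inputs are already rounded to one decimal, agreement is only expected to within rounding.

```python
def consistent(reported, parts, tol=0.06):
    """Return True if `reported` matches the mean of `parts` within rounding.

    `tol` is an assumed tolerance: the inputs are rounded to one decimal,
    so the recomputed mean can differ from the reported value by about 0.05.
    """
    mean = sum(parts) / len(parts)
    return abs(mean - reported) <= tol

checks = {
    # name: (reported summary score, component scores)
    "DAVIS 2017 (val) J&F": (83.5, [81.0, 86.0]),
    "DAVIS 2016 J&F": (88.8, [88.9, 88.7]),
    "DAVIS 2017 (test-dev) J&F": (75.0, [71.9, 78.1]),
    "D16 val (G), no YouTube-VOS training": (81.5, [80.6, 82.3]),
    "D17 val (G), no YouTube-VOS training": (75.0, [72.8, 77.2]),
    "YouTube-VOS 2018 Overall": (81.5, [82.1, 75.7, 85.7, 82.4]),
}

results = {name: consistent(rep, parts) for name, (rep, parts) in checks.items()}
for name, ok in results.items():
    print(f"{name}: {'consistent' if ok else 'mismatch'}")
```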
