Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model

Ho Kei Cheng, Alexander G. Schwing

Published: 2022-07-14
Tasks: Semi-Supervised Video Object Segmentation, 2D Human Pose Estimation, 3D Absolute Human Pose Estimation, Segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation, 2D Object Detection

Abstract

We present XMem, a video object segmentation architecture for long videos with unified feature memory stores inspired by the Atkinson-Shiffrin memory model. Prior work on video object segmentation typically only uses one type of feature memory. For videos longer than a minute, a single feature memory model tightly links memory consumption and accuracy. In contrast, following the Atkinson-Shiffrin model, we develop an architecture that incorporates multiple independent yet deeply-connected feature memory stores: a rapidly updated sensory memory, a high-resolution working memory, and a compact thus sustained long-term memory. Crucially, we develop a memory potentiation algorithm that routinely consolidates actively used working memory elements into the long-term memory, which avoids memory explosion and minimizes performance decay for long-term prediction. Combined with a new memory reading mechanism, XMem greatly exceeds state-of-the-art performance on long-video datasets while being on par with state-of-the-art methods (that do not work on long videos) on short-video datasets. Code is available at https://hkchengrex.github.io/XMem
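The three-store idea in the abstract can be illustrated with a toy sketch. This is not the XMem implementation: the class names, the usage counter, and all capacities below are hypothetical, and string keys stand in for feature tensors. It only shows the general pattern of a sensory slot overwritten every frame, a bounded working memory, and a consolidation step that moves the most-used working entries into a compact long-term store.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    key: str        # stand-in for a stored feature (hypothetical)
    usage: int = 0  # number of times this entry has been read


@dataclass
class ToyMemory:
    # All limits are illustrative assumptions, not values from the paper.
    working_capacity: int = 4
    keep_top: int = 2
    long_term_capacity: int = 8
    sensory: Optional[str] = None
    working: List[Entry] = field(default_factory=list)
    long_term: List[Entry] = field(default_factory=list)

    def observe(self, key: str) -> None:
        """Process a new frame: refresh sensory memory, extend working memory."""
        self.sensory = key  # overwritten on every frame
        self.working.append(Entry(key))
        if len(self.working) > self.working_capacity:
            self.consolidate()

    def read(self, key: str) -> bool:
        """Look up an entry and record that it was used."""
        for entry in self.working + self.long_term:
            if entry.key == key:
                entry.usage += 1
                return True
        return False

    def consolidate(self) -> None:
        """Move the most-used older working entries into long-term memory."""
        *older, newest = self.working
        ranked = sorted(older, key=lambda e: e.usage, reverse=True)
        self.long_term.extend(ranked[: self.keep_top])
        self.working = [newest]  # keep only the most recent entry
        # Bound the long-term store by dropping its least-used entries.
        self.long_term.sort(key=lambda e: e.usage, reverse=True)
        del self.long_term[self.long_term_capacity:]


memory = ToyMemory(working_capacity=3, keep_top=1, long_term_capacity=2)
for frame in ["f0", "f1", "f2"]:
    memory.observe(frame)
memory.read("f1")
memory.read("f1")
memory.read("f0")
memory.observe("f3")  # exceeds the working capacity and triggers consolidation
print([e.key for e in memory.long_term])  # ['f1']
print([e.key for e in memory.working])    # ['f3']
```

Because both stores are bounded, total memory stays constant no matter how many frames are observed, which is the property the abstract attributes to consolidation.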

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video | DAVIS-2017 (test-dev) | F-measure | 87 | XMem (BL30K, MS) |
| Video | DAVIS-2017 (test-dev) | Jaccard | 80.5 | XMem (BL30K, MS) |
| Video | DAVIS-2017 (test-dev) | Mean Jaccard & F-Measure | 83.7 | XMem (BL30K, MS) |
| Video | DAVIS-2017 (test-dev) | F-measure | 84.5 | XMem |
| Video | DAVIS-2017 (test-dev) | Jaccard | 77.4 | XMem |
| Video | DAVIS-2017 (test-dev) | Mean Jaccard & F-Measure | 81 | XMem |
| Video | YouTube-VOS 2019 | F-Measure (Seen) | 89.8 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | F-Measure (Unseen) | 89.9 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | Jaccard (Seen) | 85.5 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | Jaccard (Unseen) | 81.8 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | Mean Jaccard & F-Measure | 86.8 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | F-Measure (Seen) | 88.6 | XMem |
| Video | YouTube-VOS 2019 | F-Measure (Unseen) | 88.6 | XMem |
| Video | YouTube-VOS 2019 | Jaccard (Seen) | 84.3 | XMem |
| Video | YouTube-VOS 2019 | Jaccard (Unseen) | 80.3 | XMem |
| Video | YouTube-VOS 2019 | Mean Jaccard & F-Measure | 85.5 | XMem |
| Video | M$^3$-VOS | Average IOU | 70.4 | XMem |
| Video | DAVIS 2016 | F-Score | 94.4 | XMem (BL30K, MS) |
| Video | DAVIS 2016 | J&F | 93.3 | XMem (BL30K, MS) |
| Video | DAVIS 2016 | Jaccard (Mean) | 92.2 | XMem (BL30K, MS) |
| Video | DAVIS 2016 | F-Score | 92.7 | XMem |
| Video | DAVIS 2016 | J&F | 91.5 | XMem |
| Video | DAVIS 2016 | Jaccard (Mean) | 90.4 | XMem |
| Video | YouTube-VOS 2018 | F-Measure (Seen) | 90.3 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | F-Measure (Unseen) | 90.2 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | Jaccard (Seen) | 85.6 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | Jaccard (Unseen) | 81.7 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | Mean Jaccard & F-Measure | 86.9 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (val) | F-measure | 92.6 | XMem (BLK30K, MS) |
| Video | DAVIS 2017 (val) | Jaccard | 86.3 | XMem (BLK30K, MS) |
| Video | DAVIS 2017 (val) | Mean Jaccard & F-Measure | 89.5 | XMem (BLK30K, MS) |
| Video | DAVIS 2017 (val) | F-measure | 89.5 | XMem |
| Video | DAVIS 2017 (val) | Jaccard | 82.9 | XMem |
| Video | DAVIS 2017 (val) | Mean Jaccard & F-Measure | 86.2 | XMem |
| Video | MOSE | F | 62 | XMem |
| Video | MOSE | J | 53.3 | XMem |
| Video | MOSE | J&F | 57.6 | XMem |
| Video | DAVIS 2017 (val) | F-measure (Mean) | 92.6 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (val) | J&F | 89.5 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (val) | Jaccard (Mean) | 86.3 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (val) | F-measure (Mean) | 91 | XMem (MS) |
| Video | DAVIS 2017 (val) | J&F | 88.2 | XMem (MS) |
| Video | DAVIS 2017 (val) | Jaccard (Mean) | 85.4 | XMem (MS) |
| Video | DAVIS 2017 (val) | F-measure (Mean) | 91.4 | XMem (BL30K) |
| Video | DAVIS 2017 (val) | J&F | 87.7 | XMem (BL30K) |
| Video | DAVIS 2017 (val) | Jaccard (Mean) | 84 | XMem (BL30K) |
| Video | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem (BL30K) |
| Video | DAVIS 2017 (val) | F-measure (Mean) | 89.5 | XMem |
| Video | DAVIS 2017 (val) | J&F | 86.2 | XMem |
| Video | DAVIS 2017 (val) | Jaccard (Mean) | 82.9 | XMem |
| Video | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem |
| Video | DAVIS 2017 (val) | F-measure (Mean) | 87.6 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS 2017 (val) | J&F | 84.5 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS 2017 (val) | Jaccard (Mean) | 81.4 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS 2017 (val) | F-measure (Mean) | 79.3 | XMem (DAVIS only) |
| Video | DAVIS 2017 (val) | J&F | 76.7 | XMem (DAVIS only) |
| Video | DAVIS 2017 (val) | Jaccard (Mean) | 74.1 | XMem (DAVIS only) |
| Video | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem (DAVIS only) |
| Video | DAVIS 2016 | F-measure (Mean) | 94.4 | XMem (BL30K, MS) |
| Video | DAVIS 2016 | J&F | 93.3 | XMem (BL30K, MS) |
| Video | DAVIS 2016 | Jaccard (Mean) | 92.2 | XMem (BL30K, MS) |
| Video | DAVIS 2016 | F-measure (Mean) | 93.5 | XMem (MS) |
| Video | DAVIS 2016 | J&F | 92.7 | XMem (MS) |
| Video | DAVIS 2016 | Jaccard (Mean) | 92 | XMem (MS) |
| Video | DAVIS 2016 | F-measure (Mean) | 93.2 | XMem (BL30K) |
| Video | DAVIS 2016 | J&F | 92 | XMem (BL30K) |
| Video | DAVIS 2016 | Jaccard (Mean) | 90.7 | XMem (BL30K) |
| Video | DAVIS 2016 | Speed (FPS) | 29.6 | XMem (BL30K) |
| Video | DAVIS 2016 | F-measure (Mean) | 92.7 | XMem |
| Video | DAVIS 2016 | J&F | 91.5 | XMem |
| Video | DAVIS 2016 | Jaccard (Mean) | 90.4 | XMem |
| Video | DAVIS 2016 | Speed (FPS) | 29.6 | XMem |
| Video | DAVIS 2016 | F-measure (Mean) | 91.9 | XMem (DAVIS+YouTubeVOS only) |
| Video | DAVIS 2016 | J&F | 90.8 | XMem (DAVIS+YouTubeVOS only) |
| Video | DAVIS 2016 | Jaccard (Mean) | 89.6 | XMem (DAVIS+YouTubeVOS only) |
| Video | DAVIS 2016 | Speed (FPS) | 29.6 | XMem (DAVIS+YouTubeVOS only) |
| Video | DAVIS 2016 | F-measure (Mean) | 88.9 | XMem (DAVIS only) |
| Video | DAVIS 2016 | J&F | 87.8 | XMem (DAVIS only) |
| Video | DAVIS 2016 | Jaccard (Mean) | 86.7 | XMem (DAVIS only) |
| Video | DAVIS 2016 | Speed (FPS) | 29.6 | XMem (DAVIS only) |
| Video | YouTube-VOS 2019 | F-Measure (Seen) | 89.8 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | F-Measure (Unseen) | 89.9 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | Jaccard (Seen) | 85.5 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | Jaccard (Unseen) | 81.8 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | Overall | 86.8 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | F-Measure (Seen) | 89.2 | XMem (MS) |
| Video | YouTube-VOS 2019 | F-Measure (Unseen) | 89.8 | XMem (MS) |
| Video | YouTube-VOS 2019 | Jaccard (Seen) | 84.9 | XMem (MS) |
| Video | YouTube-VOS 2019 | Jaccard (Unseen) | 81.8 | XMem (MS) |
| Video | YouTube-VOS 2019 | Overall | 86.4 | XMem (MS) |
| Video | YouTube-VOS 2019 | F-Measure (Seen) | 89.2 | XMem (BL30K) |
| Video | YouTube-VOS 2019 | F-Measure (Unseen) | 88.8 | XMem (BL30K) |
| Video | YouTube-VOS 2019 | Jaccard (Seen) | 84.8 | XMem (BL30K) |
| Video | YouTube-VOS 2019 | Jaccard (Unseen) | 80.3 | XMem (BL30K) |
| Video | YouTube-VOS 2019 | Overall | 85.8 | XMem (BL30K) |
| Video | YouTube-VOS 2019 | F-Measure (Seen) | 88 | XMem |
| Video | YouTube-VOS 2019 | F-Measure (Unseen) | 87.1 | XMem |
| Video | YouTube-VOS 2019 | Jaccard (Seen) | 83.6 | XMem |
| Video | YouTube-VOS 2019 | Jaccard (Unseen) | 78.5 | XMem |
| Video | YouTube-VOS 2019 | Overall | 84.3 | XMem |
| Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 87 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (test-dev) | J&F | 83.7 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 80.5 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 86.4 | XMem (MS) |
| Video | DAVIS 2017 (test-dev) | J&F | 83.1 | XMem (MS) |
| Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 79.7 | XMem (MS) |
| Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 85.8 | XMem (BL30K, 600p) |
| Video | DAVIS 2017 (test-dev) | J&F | 82.5 | XMem (BL30K, 600p) |
| Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 79.1 | XMem (BL30K, 600p) |
| Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 84.7 | XMem (BL30K) |
| Video | DAVIS 2017 (test-dev) | J&F | 81.2 | XMem (BL30K) |
| Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 77.6 | XMem (BL30K) |
| Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 84.5 | XMem |
| Video | DAVIS 2017 (test-dev) | J&F | 81 | XMem |
| Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 77.4 | XMem |
| Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 83.4 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS 2017 (test-dev) | J&F | 79.8 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 76.3 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS (no YouTube-VOS training) | FPS | 29.6 | XMem |
| Video | YouTube-VOS 2018 | F-Measure (Seen) | 90.3 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | F-Measure (Unseen) | 90.2 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | Jaccard (Seen) | 85.6 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | Jaccard (Unseen) | 81.7 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | Overall | 86.9 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | F-Measure (Seen) | 89.9 | XMem (MS) |
| Video | YouTube-VOS 2018 | F-Measure (Unseen) | 89.9 | XMem (MS) |
| Video | YouTube-VOS 2018 | Jaccard (Seen) | 85.3 | XMem (MS) |
| Video | YouTube-VOS 2018 | Jaccard (Unseen) | 81.7 | XMem (MS) |
| Video | YouTube-VOS 2018 | Overall | 86.7 | XMem (MS) |
| Video | YouTube-VOS 2018 | F-Measure (Seen) | 89.8 | XMem (BL30K) |
| Video | YouTube-VOS 2018 | F-Measure (Unseen) | 89.2 | XMem (BL30K) |
| Video | YouTube-VOS 2018 | Jaccard (Seen) | 85.1 | XMem (BL30K) |
| Video | YouTube-VOS 2018 | Jaccard (Unseen) | 80.3 | XMem (BL30K) |
| Video | YouTube-VOS 2018 | Overall | 86.1 | XMem (BL30K) |
| Video | YouTube-VOS 2018 | Speed (FPS) | 22.6 | XMem (BL30K) |
| Video | YouTube-VOS 2018 | F-Measure (Seen) | 89.3 | XMem |
| Video | YouTube-VOS 2018 | F-Measure (Unseen) | 88.7 | XMem |
| Video | YouTube-VOS 2018 | Jaccard (Seen) | 84.6 | XMem |
| Video | YouTube-VOS 2018 | Jaccard (Unseen) | 80.2 | XMem |
| Video | YouTube-VOS 2018 | Overall | 85.7 | XMem |
| Video | YouTube-VOS 2018 | Speed (FPS) | 22.6 | XMem |
| Video | YouTube-VOS 2018 | F-Measure (Seen) | 88.5 | XMem (YouTubeVOS only) |
| Video | YouTube-VOS 2018 | F-Measure (Unseen) | 87.2 | XMem (YouTubeVOS only) |
| Video | YouTube-VOS 2018 | Jaccard (Seen) | 83.7 | XMem (YouTubeVOS only) |
| Video | YouTube-VOS 2018 | Jaccard (Unseen) | 78.2 | XMem (YouTubeVOS only) |
| Video | YouTube-VOS 2018 | Overall | 84.4 | XMem (YouTubeVOS only) |
| Video | YouTube-VOS 2018 | Speed (FPS) | 22.6 | XMem (YouTubeVOS only) |

The same results are also listed, with identical values, under the task Video Object Segmentation (all Video rows above) and, from the MOSE rows onward, under the task Semi-Supervised Video Object Segmentation.
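
As a reading aid for the results above: on the DAVIS benchmarks J&F is conventionally the mean of the Jaccard (region similarity) and F-measure (contour accuracy) scores, and the YouTube-VOS overall score is the mean of the four seen/unseen J and F values. The sketch below, whose function names are hypothetical, recomputes a few aggregates from the listed components. Small differences of about 0.05 are expected, because the listed components are already rounded to one decimal place.

```python
def mean_jf(jaccard: float, f_measure: float) -> float:
    """DAVIS-style J&F: the average of Jaccard and F-measure."""
    return (jaccard + f_measure) / 2.0


def youtube_vos_overall(j_seen: float, f_seen: float,
                        j_unseen: float, f_unseen: float) -> float:
    """YouTube-VOS overall score: the average of the four J/F values."""
    return (j_seen + f_seen + j_unseen + f_unseen) / 4.0


# DAVIS 2017 (val), XMem (BL30K, MS): listed J&F is 89.5
print(round(mean_jf(86.3, 92.6), 2))

# DAVIS 2017 (test-dev), XMem (BL30K, MS): listed J&F is 83.7
print(round(mean_jf(80.5, 87.0), 2))

# YouTube-VOS 2019, XMem (BL30K, MS): listed overall is 86.8
print(round(youtube_vos_overall(85.5, 89.8, 81.8, 89.9), 2))
```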
