Ho Kei Cheng, Alexander G. Schwing
We present XMem, a video object segmentation architecture for long videos with unified feature memory stores inspired by the Atkinson-Shiffrin memory model. Prior work on video object segmentation typically uses only one type of feature memory. For videos longer than a minute, a single feature memory model tightly links memory consumption and accuracy. In contrast, following the Atkinson-Shiffrin model, we develop an architecture that incorporates multiple independent yet deeply connected feature memory stores: a rapidly updated sensory memory, a high-resolution working memory, and a compact, and thus sustained, long-term memory. Crucially, we develop a memory potentiation algorithm that routinely consolidates actively used working memory elements into the long-term memory, which avoids memory explosion and minimizes performance decay for long-term prediction. Combined with a new memory reading mechanism, XMem greatly exceeds state-of-the-art performance on long-video datasets while being on par with state-of-the-art methods (that do not work on long videos) on short-video datasets. Code is available at https://hkchengrex.github.io/XMem
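The consolidation idea in the abstract (promote actively used working-memory elements to a compact long-term store so that memory does not grow without bound) can be illustrated with a toy sketch. This is not the authors' implementation: the class and field names (`MemoryElement`, `TwoStoreMemory`, `working_capacity`, `keep_top_k`), the read-count usage score, and the policy of discarding everything that is not promoted are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryElement:
    key: str            # hypothetical identifier for a stored feature
    usage: float = 0.0  # hypothetical score: how often the element was read


@dataclass
class TwoStoreMemory:
    working_capacity: int  # assumed limit on working-memory size
    keep_top_k: int        # assumed number of elements promoted per consolidation
    working: list = field(default_factory=list)
    long_term: list = field(default_factory=list)

    def add(self, key):
        """Store a new element; consolidate once the working memory overflows."""
        self.working.append(MemoryElement(key))
        if len(self.working) > self.working_capacity:
            self.consolidate()

    def read(self, key):
        """Record one use of every working-memory element matching `key`."""
        for element in self.working:
            if element.key == key:
                element.usage += 1.0

    def consolidate(self):
        """Promote the most used elements to long-term memory, drop the rest."""
        ranked = sorted(self.working, key=lambda e: e.usage, reverse=True)
        self.long_term.extend(ranked[: self.keep_top_k])
        self.working = []


memory = TwoStoreMemory(working_capacity=3, keep_top_k=1)
for frame in ["f0", "f1", "f2"]:
    memory.add(frame)
memory.read("f1")
memory.read("f1")
memory.read("f2")
memory.add("f3")  # fourth element exceeds the capacity and triggers consolidation
print([e.key for e in memory.long_term])  # "f1" was read most often, so it is promoted
```

In this toy version the total number of stored elements stays bounded by `working_capacity` plus `keep_top_k` per consolidation round, which is the property the abstract describes as avoiding memory explosion.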
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Video | DAVIS-2017 (test-dev) | F-measure | 87 | XMem (BL30K, MS) |
| Video | DAVIS-2017 (test-dev) | Jaccard | 80.5 | XMem (BL30K, MS) |
| Video | DAVIS-2017 (test-dev) | Mean Jaccard & F-Measure | 83.7 | XMem (BL30K, MS) |
| Video | DAVIS-2017 (test-dev) | F-measure | 84.5 | XMem |
| Video | DAVIS-2017 (test-dev) | Jaccard | 77.4 | XMem |
| Video | DAVIS-2017 (test-dev) | Mean Jaccard & F-Measure | 81 | XMem |
| Video | YouTube-VOS 2019 | F-Measure (Seen) | 89.8 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | F-Measure (Unseen) | 89.9 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | Jaccard (Seen) | 85.5 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | Jaccard (Unseen) | 81.8 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | Mean Jaccard & F-Measure | 86.8 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | F-Measure (Seen) | 88.6 | XMem |
| Video | YouTube-VOS 2019 | F-Measure (Unseen) | 88.6 | XMem |
| Video | YouTube-VOS 2019 | Jaccard (Seen) | 84.3 | XMem |
| Video | YouTube-VOS 2019 | Jaccard (Unseen) | 80.3 | XMem |
| Video | YouTube-VOS 2019 | Mean Jaccard & F-Measure | 85.5 | XMem |
| Video | M$^3$-VOS | Average IOU | 70.4 | XMem |
| Video | DAVIS 2016 | F-Score | 94.4 | XMem (BL30K, MS) |
| Video | DAVIS 2016 | J&F | 93.3 | XMem (BL30K, MS) |
| Video | DAVIS 2016 | Jaccard (Mean) | 92.2 | XMem (BL30K, MS) |
| Video | DAVIS 2016 | F-Score | 92.7 | XMem |
| Video | DAVIS 2016 | J&F | 91.5 | XMem |
| Video | DAVIS 2016 | Jaccard (Mean) | 90.4 | XMem |
| Video | YouTube-VOS 2018 | F-Measure (Seen) | 90.3 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | F-Measure (Unseen) | 90.2 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | Jaccard (Seen) | 85.6 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | Jaccard (Unseen) | 81.7 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | Mean Jaccard & F-Measure | 86.9 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (val) | F-measure | 92.6 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (val) | Jaccard | 86.3 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (val) | Mean Jaccard & F-Measure | 89.5 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (val) | F-measure | 89.5 | XMem |
| Video | DAVIS 2017 (val) | Jaccard | 82.9 | XMem |
| Video | DAVIS 2017 (val) | Mean Jaccard & F-Measure | 86.2 | XMem |
| Video | MOSE | F | 62 | XMem |
| Video | MOSE | J | 53.3 | XMem |
| Video | MOSE | J&F | 57.6 | XMem |
| Video | DAVIS 2017 (val) | F-measure (Mean) | 92.6 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (val) | J&F | 89.5 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (val) | Jaccard (Mean) | 86.3 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (val) | F-measure (Mean) | 91 | XMem (MS) |
| Video | DAVIS 2017 (val) | J&F | 88.2 | XMem (MS) |
| Video | DAVIS 2017 (val) | Jaccard (Mean) | 85.4 | XMem (MS) |
| Video | DAVIS 2017 (val) | F-measure (Mean) | 91.4 | XMem (BL30K) |
| Video | DAVIS 2017 (val) | J&F | 87.7 | XMem (BL30K) |
| Video | DAVIS 2017 (val) | Jaccard (Mean) | 84 | XMem (BL30K) |
| Video | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem (BL30K) |
| Video | DAVIS 2017 (val) | F-measure (Mean) | 89.5 | XMem |
| Video | DAVIS 2017 (val) | J&F | 86.2 | XMem |
| Video | DAVIS 2017 (val) | Jaccard (Mean) | 82.9 | XMem |
| Video | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem |
| Video | DAVIS 2017 (val) | F-measure (Mean) | 87.6 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS 2017 (val) | J&F | 84.5 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS 2017 (val) | Jaccard (Mean) | 81.4 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS 2017 (val) | F-measure (Mean) | 79.3 | XMem (DAVIS only) |
| Video | DAVIS 2017 (val) | J&F | 76.7 | XMem (DAVIS only) |
| Video | DAVIS 2017 (val) | Jaccard (Mean) | 74.1 | XMem (DAVIS only) |
| Video | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem (DAVIS only) |
| Video | DAVIS 2016 | F-measure (Mean) | 94.4 | XMem (BL30K, MS) |
| Video | DAVIS 2016 | J&F | 93.3 | XMem (BL30K, MS) |
| Video | DAVIS 2016 | Jaccard (Mean) | 92.2 | XMem (BL30K, MS) |
| Video | DAVIS 2016 | F-measure (Mean) | 93.5 | XMem (MS) |
| Video | DAVIS 2016 | J&F | 92.7 | XMem (MS) |
| Video | DAVIS 2016 | Jaccard (Mean) | 92 | XMem (MS) |
| Video | DAVIS 2016 | F-measure (Mean) | 93.2 | XMem (BL30K) |
| Video | DAVIS 2016 | J&F | 92 | XMem (BL30K) |
| Video | DAVIS 2016 | Jaccard (Mean) | 90.7 | XMem (BL30K) |
| Video | DAVIS 2016 | Speed (FPS) | 29.6 | XMem (BL30K) |
| Video | DAVIS 2016 | F-measure (Mean) | 92.7 | XMem |
| Video | DAVIS 2016 | J&F | 91.5 | XMem |
| Video | DAVIS 2016 | Jaccard (Mean) | 90.4 | XMem |
| Video | DAVIS 2016 | Speed (FPS) | 29.6 | XMem |
| Video | DAVIS 2016 | F-measure (Mean) | 91.9 | XMem (DAVIS+YouTubeVOS only) |
| Video | DAVIS 2016 | J&F | 90.8 | XMem (DAVIS+YouTubeVOS only) |
| Video | DAVIS 2016 | Jaccard (Mean) | 89.6 | XMem (DAVIS+YouTubeVOS only) |
| Video | DAVIS 2016 | Speed (FPS) | 29.6 | XMem (DAVIS+YouTubeVOS only) |
| Video | DAVIS 2016 | F-measure (Mean) | 88.9 | XMem (DAVIS only) |
| Video | DAVIS 2016 | J&F | 87.8 | XMem (DAVIS only) |
| Video | DAVIS 2016 | Jaccard (Mean) | 86.7 | XMem (DAVIS only) |
| Video | DAVIS 2016 | Speed (FPS) | 29.6 | XMem (DAVIS only) |
| Video | YouTube-VOS 2019 | F-Measure (Seen) | 89.8 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | F-Measure (Unseen) | 89.9 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | Jaccard (Seen) | 85.5 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | Jaccard (Unseen) | 81.8 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | Overall | 86.8 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2019 | F-Measure (Seen) | 89.2 | XMem (MS) |
| Video | YouTube-VOS 2019 | F-Measure (Unseen) | 89.8 | XMem (MS) |
| Video | YouTube-VOS 2019 | Jaccard (Seen) | 84.9 | XMem (MS) |
| Video | YouTube-VOS 2019 | Jaccard (Unseen) | 81.8 | XMem (MS) |
| Video | YouTube-VOS 2019 | Overall | 86.4 | XMem (MS) |
| Video | YouTube-VOS 2019 | F-Measure (Seen) | 89.2 | XMem (BL30K) |
| Video | YouTube-VOS 2019 | F-Measure (Unseen) | 88.8 | XMem (BL30K) |
| Video | YouTube-VOS 2019 | Jaccard (Seen) | 84.8 | XMem (BL30K) |
| Video | YouTube-VOS 2019 | Jaccard (Unseen) | 80.3 | XMem (BL30K) |
| Video | YouTube-VOS 2019 | Overall | 85.8 | XMem (BL30K) |
| Video | YouTube-VOS 2019 | F-Measure (Seen) | 88 | XMem |
| Video | YouTube-VOS 2019 | F-Measure (Unseen) | 87.1 | XMem |
| Video | YouTube-VOS 2019 | Jaccard (Seen) | 83.6 | XMem |
| Video | YouTube-VOS 2019 | Jaccard (Unseen) | 78.5 | XMem |
| Video | YouTube-VOS 2019 | Overall | 84.3 | XMem |
| Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 87 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (test-dev) | J&F | 83.7 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 80.5 | XMem (BL30K, MS) |
| Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 86.4 | XMem (MS) |
| Video | DAVIS 2017 (test-dev) | J&F | 83.1 | XMem (MS) |
| Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 79.7 | XMem (MS) |
| Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 85.8 | XMem (BL30K, 600p) |
| Video | DAVIS 2017 (test-dev) | J&F | 82.5 | XMem (BL30K, 600p) |
| Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 79.1 | XMem (BL30K, 600p) |
| Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 84.7 | XMem (BL30K) |
| Video | DAVIS 2017 (test-dev) | J&F | 81.2 | XMem (BL30K) |
| Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 77.6 | XMem (BL30K) |
| Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 84.5 | XMem |
| Video | DAVIS 2017 (test-dev) | J&F | 81 | XMem |
| Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 77.4 | XMem |
| Video | DAVIS 2017 (test-dev) | F-measure (Mean) | 83.4 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS 2017 (test-dev) | J&F | 79.8 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS 2017 (test-dev) | Jaccard (Mean) | 76.3 | XMem (DAVIS and YouTubeVOS only) |
| Video | DAVIS (no YouTube-VOS training) | FPS | 29.6 | XMem |
| Video | YouTube-VOS 2018 | F-Measure (Seen) | 90.3 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | F-Measure (Unseen) | 90.2 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | Jaccard (Seen) | 85.6 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | Jaccard (Unseen) | 81.7 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | Overall | 86.9 | XMem (BL30K, MS) |
| Video | YouTube-VOS 2018 | F-Measure (Seen) | 89.9 | XMem (MS) |
| Video | YouTube-VOS 2018 | F-Measure (Unseen) | 89.9 | XMem (MS) |
| Video | YouTube-VOS 2018 | Jaccard (Seen) | 85.3 | XMem (MS) |
| Video | YouTube-VOS 2018 | Jaccard (Unseen) | 81.7 | XMem (MS) |
| Video | YouTube-VOS 2018 | Overall | 86.7 | XMem (MS) |
| Video | YouTube-VOS 2018 | F-Measure (Seen) | 89.8 | XMem (BL30K) |
| Video | YouTube-VOS 2018 | F-Measure (Unseen) | 89.2 | XMem (BL30K) |
| Video | YouTube-VOS 2018 | Jaccard (Seen) | 85.1 | XMem (BL30K) |
| Video | YouTube-VOS 2018 | Jaccard (Unseen) | 80.3 | XMem (BL30K) |
| Video | YouTube-VOS 2018 | Overall | 86.1 | XMem (BL30K) |
| Video | YouTube-VOS 2018 | Speed (FPS) | 22.6 | XMem (BL30K) |
| Video | YouTube-VOS 2018 | F-Measure (Seen) | 89.3 | XMem |
| Video | YouTube-VOS 2018 | F-Measure (Unseen) | 88.7 | XMem |
| Video | YouTube-VOS 2018 | Jaccard (Seen) | 84.6 | XMem |
| Video | YouTube-VOS 2018 | Jaccard (Unseen) | 80.2 | XMem |
| Video | YouTube-VOS 2018 | Overall | 85.7 | XMem |
| Video | YouTube-VOS 2018 | Speed (FPS) | 22.6 | XMem |
| Video | YouTube-VOS 2018 | F-Measure (Seen) | 88.5 | XMem (YouTubeVOS only) |
| Video | YouTube-VOS 2018 | F-Measure (Unseen) | 87.2 | XMem (YouTubeVOS only) |
| Video | YouTube-VOS 2018 | Jaccard (Seen) | 83.7 | XMem (YouTubeVOS only) |
| Video | YouTube-VOS 2018 | Jaccard (Unseen) | 78.2 | XMem (YouTubeVOS only) |
| Video | YouTube-VOS 2018 | Overall | 84.4 | XMem (YouTubeVOS only) |
| Video | YouTube-VOS 2018 | Speed (FPS) | 22.6 | XMem (YouTubeVOS only) |
| Video Object Segmentation | DAVIS-2017 (test-dev) | F-measure | 87 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS-2017 (test-dev) | Jaccard | 80.5 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS-2017 (test-dev) | Mean Jaccard & F-Measure | 83.7 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS-2017 (test-dev) | F-measure | 84.5 | XMem |
| Video Object Segmentation | DAVIS-2017 (test-dev) | Jaccard | 77.4 | XMem |
| Video Object Segmentation | DAVIS-2017 (test-dev) | Mean Jaccard & F-Measure | 81 | XMem |
| Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Seen) | 89.8 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Unseen) | 89.9 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Seen) | 85.5 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Unseen) | 81.8 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2019 | Mean Jaccard & F-Measure | 86.8 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Seen) | 88.6 | XMem |
| Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Unseen) | 88.6 | XMem |
| Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Seen) | 84.3 | XMem |
| Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Unseen) | 80.3 | XMem |
| Video Object Segmentation | YouTube-VOS 2019 | Mean Jaccard & F-Measure | 85.5 | XMem |
| Video Object Segmentation | M$^3$-VOS | Average IOU | 70.4 | XMem |
| Video Object Segmentation | DAVIS 2016 | F-Score | 94.4 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2016 | J&F | 93.3 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 92.2 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2016 | F-Score | 92.7 | XMem |
| Video Object Segmentation | DAVIS 2016 | J&F | 91.5 | XMem |
| Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 90.4 | XMem |
| Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 90.3 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 90.2 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 85.6 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 81.7 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2018 | Mean Jaccard & F-Measure | 86.9 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2017 (val) | F-measure | 92.6 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2017 (val) | Jaccard | 86.3 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2017 (val) | Mean Jaccard & F-Measure | 89.5 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2017 (val) | F-measure | 89.5 | XMem |
| Video Object Segmentation | DAVIS 2017 (val) | Jaccard | 82.9 | XMem |
| Video Object Segmentation | DAVIS 2017 (val) | Mean Jaccard & F-Measure | 86.2 | XMem |
| Video Object Segmentation | MOSE | F | 62 | XMem |
| Video Object Segmentation | MOSE | J | 53.3 | XMem |
| Video Object Segmentation | MOSE | J&F | 57.6 | XMem |
| Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 92.6 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2017 (val) | J&F | 89.5 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 86.3 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 91 | XMem (MS) |
| Video Object Segmentation | DAVIS 2017 (val) | J&F | 88.2 | XMem (MS) |
| Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 85.4 | XMem (MS) |
| Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 91.4 | XMem (BL30K) |
| Video Object Segmentation | DAVIS 2017 (val) | J&F | 87.7 | XMem (BL30K) |
| Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 84 | XMem (BL30K) |
| Video Object Segmentation | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem (BL30K) |
| Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 89.5 | XMem |
| Video Object Segmentation | DAVIS 2017 (val) | J&F | 86.2 | XMem |
| Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 82.9 | XMem |
| Video Object Segmentation | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem |
| Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 87.6 | XMem (DAVIS and YouTubeVOS only) |
| Video Object Segmentation | DAVIS 2017 (val) | J&F | 84.5 | XMem (DAVIS and YouTubeVOS only) |
| Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 81.4 | XMem (DAVIS and YouTubeVOS only) |
| Video Object Segmentation | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem (DAVIS and YouTubeVOS only) |
| Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 79.3 | XMem (DAVIS only) |
| Video Object Segmentation | DAVIS 2017 (val) | J&F | 76.7 | XMem (DAVIS only) |
| Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 74.1 | XMem (DAVIS only) |
| Video Object Segmentation | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem (DAVIS only) |
| Video Object Segmentation | DAVIS 2016 | F-measure (Mean) | 94.4 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2016 | J&F | 93.3 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 92.2 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2016 | F-measure (Mean) | 93.5 | XMem (MS) |
| Video Object Segmentation | DAVIS 2016 | J&F | 92.7 | XMem (MS) |
| Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 92 | XMem (MS) |
| Video Object Segmentation | DAVIS 2016 | F-measure (Mean) | 93.2 | XMem (BL30K) |
| Video Object Segmentation | DAVIS 2016 | J&F | 92 | XMem (BL30K) |
| Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 90.7 | XMem (BL30K) |
| Video Object Segmentation | DAVIS 2016 | Speed (FPS) | 29.6 | XMem (BL30K) |
| Video Object Segmentation | DAVIS 2016 | F-measure (Mean) | 92.7 | XMem |
| Video Object Segmentation | DAVIS 2016 | J&F | 91.5 | XMem |
| Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 90.4 | XMem |
| Video Object Segmentation | DAVIS 2016 | Speed (FPS) | 29.6 | XMem |
| Video Object Segmentation | DAVIS 2016 | F-measure (Mean) | 91.9 | XMem (DAVIS+YouTubeVOS only) |
| Video Object Segmentation | DAVIS 2016 | J&F | 90.8 | XMem (DAVIS+YouTubeVOS only) |
| Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 89.6 | XMem (DAVIS+YouTubeVOS only) |
| Video Object Segmentation | DAVIS 2016 | Speed (FPS) | 29.6 | XMem (DAVIS+YouTubeVOS only) |
| Video Object Segmentation | DAVIS 2016 | F-measure (Mean) | 88.9 | XMem (DAVIS only) |
| Video Object Segmentation | DAVIS 2016 | J&F | 87.8 | XMem (DAVIS only) |
| Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 86.7 | XMem (DAVIS only) |
| Video Object Segmentation | DAVIS 2016 | Speed (FPS) | 29.6 | XMem (DAVIS only) |
| Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Seen) | 89.8 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Unseen) | 89.9 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Seen) | 85.5 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Unseen) | 81.8 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2019 | Overall | 86.8 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Seen) | 89.2 | XMem (MS) |
| Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Unseen) | 89.8 | XMem (MS) |
| Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Seen) | 84.9 | XMem (MS) |
| Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Unseen) | 81.8 | XMem (MS) |
| Video Object Segmentation | YouTube-VOS 2019 | Overall | 86.4 | XMem (MS) |
| Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Seen) | 89.2 | XMem (BL30K) |
| Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Unseen) | 88.8 | XMem (BL30K) |
| Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Seen) | 84.8 | XMem (BL30K) |
| Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Unseen) | 80.3 | XMem (BL30K) |
| Video Object Segmentation | YouTube-VOS 2019 | Overall | 85.8 | XMem (BL30K) |
| Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Seen) | 88 | XMem |
| Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Unseen) | 87.1 | XMem |
| Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Seen) | 83.6 | XMem |
| Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Unseen) | 78.5 | XMem |
| Video Object Segmentation | YouTube-VOS 2019 | Overall | 84.3 | XMem |
| Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 87 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 83.7 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 80.5 | XMem (BL30K, MS) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 86.4 | XMem (MS) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 83.1 | XMem (MS) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 79.7 | XMem (MS) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 85.8 | XMem (BL30K, 600p) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 82.5 | XMem (BL30K, 600p) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 79.1 | XMem (BL30K, 600p) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 84.7 | XMem (BL30K) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 81.2 | XMem (BL30K) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 77.6 | XMem (BL30K) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 84.5 | XMem |
| Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 81 | XMem |
| Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 77.4 | XMem |
| Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 83.4 | XMem (DAVIS and YouTubeVOS only) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 79.8 | XMem (DAVIS and YouTubeVOS only) |
| Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 76.3 | XMem (DAVIS and YouTubeVOS only) |
| Video Object Segmentation | DAVIS (no YouTube-VOS training) | FPS | 29.6 | XMem |
| Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 90.3 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 90.2 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 85.6 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 81.7 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2018 | Overall | 86.9 | XMem (BL30K, MS) |
| Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 89.9 | XMem (MS) |
| Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 89.9 | XMem (MS) |
| Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 85.3 | XMem (MS) |
| Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 81.7 | XMem (MS) |
| Video Object Segmentation | YouTube-VOS 2018 | Overall | 86.7 | XMem (MS) |
| Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 89.8 | XMem (BL30K) |
| Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 89.2 | XMem (BL30K) |
| Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 85.1 | XMem (BL30K) |
| Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 80.3 | XMem (BL30K) |
| Video Object Segmentation | YouTube-VOS 2018 | Overall | 86.1 | XMem (BL30K) |
| Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 22.6 | XMem (BL30K) |
| Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 89.3 | XMem |
| Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 88.7 | XMem |
| Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 84.6 | XMem |
| Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 80.2 | XMem |
| Video Object Segmentation | YouTube-VOS 2018 | Overall | 85.7 | XMem |
| Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 22.6 | XMem |
| Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 88.5 | XMem (YouTubeVOS only) |
| Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 87.2 | XMem (YouTubeVOS only) |
| Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 83.7 | XMem (YouTubeVOS only) |
| Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 78.2 | XMem (YouTubeVOS only) |
| Video Object Segmentation | YouTube-VOS 2018 | Overall | 84.4 | XMem (YouTubeVOS only) |
| Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 22.6 | XMem (YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | MOSE | F | 62 | XMem |
| Semi-Supervised Video Object Segmentation | MOSE | J | 53.3 | XMem |
| Semi-Supervised Video Object Segmentation | MOSE | J&F | 57.6 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 92.6 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | J&F | 89.5 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 86.3 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 91 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | J&F | 88.2 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 85.4 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 91.4 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | J&F | 87.7 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 84 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 89.5 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | J&F | 86.2 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 82.9 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 87.6 | XMem (DAVIS and YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | J&F | 84.5 | XMem (DAVIS and YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 81.4 | XMem (DAVIS and YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem (DAVIS and YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | F-measure (Mean) | 79.3 | XMem (DAVIS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | J&F | 76.7 | XMem (DAVIS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Jaccard (Mean) | 74.1 | XMem (DAVIS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | Speed (FPS) | 22.6 | XMem (DAVIS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | F-measure (Mean) | 94.4 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | J&F | 93.3 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 92.2 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | F-measure (Mean) | 93.5 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | J&F | 92.7 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 92 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | F-measure (Mean) | 93.2 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | J&F | 92 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 90.7 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | Speed (FPS) | 29.6 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | F-measure (Mean) | 92.7 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | J&F | 91.5 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 90.4 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | Speed (FPS) | 29.6 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | F-measure (Mean) | 91.9 | XMem (DAVIS+YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | J&F | 90.8 | XMem (DAVIS+YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 89.6 | XMem (DAVIS+YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | Speed (FPS) | 29.6 | XMem (DAVIS+YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | F-measure (Mean) | 88.9 | XMem (DAVIS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | J&F | 87.8 | XMem (DAVIS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | Jaccard (Mean) | 86.7 | XMem (DAVIS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2016 | Speed (FPS) | 29.6 | XMem (DAVIS only) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Seen) | 89.8 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Unseen) | 89.9 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Seen) | 85.5 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Unseen) | 81.8 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Overall | 86.8 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Seen) | 89.2 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Unseen) | 89.8 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Seen) | 84.9 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Unseen) | 81.8 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Overall | 86.4 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Seen) | 89.2 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Unseen) | 88.8 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Seen) | 84.8 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Unseen) | 80.3 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Overall | 85.8 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Seen) | 88 | XMem |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | F-Measure (Unseen) | 87.1 | XMem |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Seen) | 83.6 | XMem |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Jaccard (Unseen) | 78.5 | XMem |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2019 | Overall | 84.3 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 87 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 83.7 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 80.5 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 86.4 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 83.1 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 79.7 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 85.8 | XMem (BL30K, 600p) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 82.5 | XMem (BL30K, 600p) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 79.1 | XMem (BL30K, 600p) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 84.7 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 81.2 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 77.6 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 84.5 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 81 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 77.4 | XMem |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | F-measure (Mean) | 83.4 | XMem (DAVIS and YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | J&F | 79.8 | XMem (DAVIS and YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | Jaccard (Mean) | 76.3 | XMem (DAVIS and YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | DAVIS (no YouTube-VOS training) | FPS | 29.6 | XMem |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 90.3 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 90.2 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 85.6 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 81.7 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 86.9 | XMem (BL30K, MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 89.9 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 89.9 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 85.3 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 81.7 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 86.7 | XMem (MS) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 89.8 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 89.2 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 85.1 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 80.3 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 86.1 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 22.6 | XMem (BL30K) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 89.3 | XMem |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 88.7 | XMem |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 84.6 | XMem |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 80.2 | XMem |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 85.7 | XMem |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 22.6 | XMem |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 88.5 | XMem (YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 87.2 | XMem (YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 83.7 | XMem (YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 78.2 | XMem (YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 84.4 | XMem (YouTubeVOS only) |
| Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 22.6 | XMem (YouTubeVOS only) |
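
The aggregate rows in the table (J&F, Mean Jaccard & F-Measure, Overall) are averages of the component rows: J&F is the mean of the Jaccard and F-measure scores, and the YouTube-VOS overall score is the mean of the four seen/unseen Jaccard and F-measure scores. The short check below, with values copied from the table, shows the arithmetic. The helper names `mean_of` and `consistent` and the 0.1 tolerance are assumptions for illustration; the tolerance is there because the table reports values rounded to one decimal, so a mean recomputed from rounded components can differ slightly from the reported aggregate.

```python
def mean_of(*scores):
    """Arithmetic mean of the given component scores."""
    return sum(scores) / len(scores)


def consistent(reported, *components, tolerance=0.1):
    """True if the reported aggregate matches the mean of its components
    up to rounding; the small epsilon absorbs floating-point error."""
    return abs(mean_of(*components) - reported) <= tolerance + 1e-9


# DAVIS 2016, XMem (BL30K, MS): Jaccard 92.2 and F-measure 94.4, reported J&F 93.3
print(consistent(93.3, 92.2, 94.4))

# YouTube-VOS 2019, XMem (BL30K, MS): F seen/unseen 89.8 and 89.9,
# Jaccard seen/unseen 85.5 and 81.8, reported overall 86.8
print(consistent(86.8, 89.8, 89.9, 85.5, 81.8))
```

The same check applies to any model variant in the table by substituting its component rows.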