Scalable Video Object Segmentation with Identification Mechanism

Zongxin Yang, Jiaxu Miao, Yunchao Wei, Wenguan Wang, Xiaohan Wang, Yi Yang

2022-03-22
Tasks: Semi-Supervised Video Object Segmentation, Segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation

Abstract

This paper delves into the challenges of achieving scalable and effective multi-object modeling for semi-supervised Video Object Segmentation (VOS). Previous VOS methods decode features with a single positive object, limiting the learning of multi-object representation as they must match and segment each target separately under multi-object scenarios. Additionally, earlier techniques catered to specific application objectives and lacked the flexibility to fulfill different speed-accuracy requirements. To address these problems, we present two innovative approaches, Associating Objects with Transformers (AOT) and Associating Objects with Scalable Transformers (AOST). In pursuing effective multi-object modeling, AOT introduces the IDentification (ID) mechanism to allocate each object a unique identity. This approach enables the network to model the associations among all objects simultaneously, thus facilitating the tracking and segmentation of objects in a single network pass. To address the challenge of inflexible deployment, AOST further integrates scalable long short-term transformers that incorporate scalable supervision and layer-wise ID-based attention. This enables online architecture scalability in VOS for the first time and overcomes ID embeddings' representation limitations. Given the absence of a benchmark for VOS involving densely multi-object annotations, we propose a challenging Video Object Segmentation in the Wild (VOSW) benchmark to validate our approaches. We evaluated various AOT and AOST variants using extensive experiments across VOSW and five commonly used VOS benchmarks, including YouTube-VOS 2018 & 2019 Val, DAVIS-2017 Val & Test, and DAVIS-2016. Our approaches surpass the state-of-the-art competitors and display exceptional efficiency and scalability consistently across all six benchmarks. Project page: https://github.com/yoxu515/aot-benchmark.
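The ID mechanism described in the abstract gives every object its own identity so that all objects can be handled in one pass. The sketch below illustrates that idea only and is not the authors' implementation: it assumes a hypothetical bank of identity vectors (`make_id_bank`, random here, whereas a real model would learn them), assigns each label in a multi-object mask a distinct slot in that bank (`assign_identities`), and replaces each pixel's label with its identity vector (`embed_mask`). The bank size, vector dimension, function names, and the choice to treat background as just another label are all assumptions made for illustration.

```python
import random


def make_id_bank(num_ids, dim, seed=0):
    """Build a hypothetical bank of identity vectors (random stand-ins for learned ones)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_ids)]


def assign_identities(labels, num_ids, seed=0):
    """Map each distinct object label to its own slot in the bank (no slot is shared)."""
    labels = sorted(set(labels))
    if len(labels) > num_ids:
        raise ValueError("more objects than identities in the bank")
    rng = random.Random(seed)
    slots = rng.sample(range(num_ids), len(labels))
    return dict(zip(labels, slots))


def embed_mask(mask, id_bank, assignment):
    """Replace every pixel's object label with that object's identity vector."""
    return [[id_bank[assignment[label]] for label in row] for row in mask]


# Toy multi-object mask: 0 is background, 1..3 are three objects (assumed labels).
mask = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [3, 3, 2, 0],
]
id_bank = make_id_bank(num_ids=10, dim=4)
assignment = assign_identities([label for row in mask for label in row], num_ids=10)
id_map = embed_mask(mask, id_bank, assignment)

# One embedding map now carries all objects at once: height x width x dim.
print(len(id_map), len(id_map[0]), len(id_map[0][0]))  # 3 4 4
```

Because every object lives in the same embedding map, a single network pass can see all targets together, which is the property the abstract attributes to the ID mechanism. How the identities are then used inside the attention layers is not covered by this sketch.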

Results

Task: Video

DAVIS 2017 (val)

| Model | J&F | Jaccard (Mean) | F-measure (Mean) | Params (M) | Speed (FPS) |
|---|---|---|---|---|---|
| SwinB-AOTv2-L (MS) | 87 | 84.2 | 89.8 | 65.6 | 1.3 |
| SwinB-AOST (L'=3, MS) | 86.7 | 83.8 | 89.5 | 65.6 | 1.3 |
| SwinB-AOTv2-L | 86.3 | 83.1 | 89.4 | 65.6 | 12 |
| R50-AOST (L'=3) | 85.6 | 82.6 | 88.5 | 15.4 | 17.5 |
| R50-AOST (L'=2) | 85.3 | 82.5 | 88 | 13.9 | 24.3 |
| R50-AOST (L'=1) | 83.7 | 81.2 | 86.1 | 12.5 | 37.4 |

DAVIS 2016

| Model | J&F | Jaccard (Mean) | F-measure (Mean) | Speed (FPS) |
|---|---|---|---|---|
| SwinB-AOTv2-L (MS) | 93 | 91.6 | 94.4 | 1.3 |
| SwinB-AOST (L'=3, MS) | 93 | 91.5 | 94.5 | 1.3 |
| SwinB-AOTv2-L | 92.4 | 90.6 | 94.1 | 12 |
| SwinB-AOST (L'=3) | 92.4 | 90.5 | 94.2 | 12 |
| R50-AOST (L'=3) | 92.1 | 90.6 | 93.6 | 17.5 |
| R50-AOST (L'=2) | 92 | 90.5 | 93.4 | 24.3 |
| R50-AOST (L'=1) | 90.3 | 89.6 | 90.9 | 37.4 |

YouTube-VOS 2019

| Model | Overall | Jaccard (Seen) | Jaccard (Unseen) | F-Measure (Seen) | F-Measure (Unseen) |
|---|---|---|---|---|---|
| SwinB-AOTv2-L (all frames, MS) | 86.5 | 85.5 | 81 | 90.3 | 89.1 |
| SwinB-AOTv2-L (all frames) | 85.2 | 84.2 | 79.8 | 88.9 | 88 |
| R50-AOST (L'=3) | 84.9 | 83.8 | 79.3 | 88.7 | 87.7 |
| R50-AOST (L'=2) | 84.3 | 83.3 | 78.9 | 88 | 87.1 |
| R50-AOST (L'=1) | 81.5 | 81 | 754.8 [sic] | 85.6 | 83.8 |

DAVIS 2017 (test-dev)

| Model | J&F | Jaccard (Mean) | F-measure (Mean) | FPS |
|---|---|---|---|---|
| SwinB-AOST (L'=3, MS) | 84.7 | 80.9 | 88.5 | 1.3 |
| SwinB-AOTv2-L | 84.5 | 81 | 87.9 | 1.3 |
| SwinB-AOST (L'=3) | 82.7 | 78.8 | 86.6 | 12 |
| R50-AOST (L'=3) | 79.9 | 76.2 | 83.6 | 17.5 |
| R50-AOST (L'=2) | 78.1 | 74.5 | 81.7 | 24.3 |

YouTube-VOS 2018

| Model | Overall | Jaccard (Seen) | Jaccard (Unseen) | F-Measure (Seen) | F-Measure (Unseen) | Params (M) | Speed (FPS) |
|---|---|---|---|---|---|---|---|
| SwinB-AOTv2-L (all frames, MS) | 86.5 | 85.6 | 80.7 | 90.7 | 88.9 | 65.6 | 0.7 |
| SwinB-AOTv2-L (all frames) | 85.8 | - | 79.6 | 90.1 | 88.2 | - | 5.1 |
| R50-AOTv2-L (all frames) | 85.4 | 85.1 | 78.9 | 90.2 | 87.3 | 15.1 | 6.3 |
| R50-AOST (L'=3) | 85 | 83.8 | 79.3 | 88.8 | 87.9 | 15.4 | 14.9 |
| R50-AOST (L'=2) | 84.5 | 83.5 | 78.8 | 88.5 | 87.2 | 13.9 | 20.2 |
| R50-AOST (L'=1) | 81.6 | 81.4 | 75.5 | 86.1 | 83.5 | 12.5 | 30.9 |
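
The aggregate columns in the results above can be cross-checked against their components. On DAVIS, J&F is conventionally the mean of the Jaccard and F-measure scores, and the YouTube-VOS Overall score is conventionally the mean of the four seen/unseen scores. The sketch below recomputes a few rows copied from the listing; the helper names and the 0.1 tolerance (chosen to absorb rounding of the reported values) are assumptions, not part of the source.

```python
def j_and_f(jaccard, f_measure):
    """DAVIS-style J&F: mean of region similarity (J) and contour accuracy (F)."""
    return (jaccard + f_measure) / 2


def youtube_vos_overall(j_seen, j_unseen, f_seen, f_unseen):
    """YouTube-VOS-style Overall: mean of the four seen/unseen J and F scores."""
    return (j_seen + j_unseen + f_seen + f_unseen) / 4


def is_consistent(computed, reported, tol=0.1):
    """True when the recomputed value matches the reported one up to rounding (assumed tolerance)."""
    return abs(computed - reported) <= tol + 1e-9


# DAVIS 2017 (val) rows: (Jaccard, F-measure, reported J&F)
davis17_val = {
    "SwinB-AOTv2-L (MS)": (84.2, 89.8, 87.0),
    "R50-AOST (L'=3)": (82.6, 88.5, 85.6),
    "R50-AOST (L'=1)": (81.2, 86.1, 83.7),
}
for model, (j, f, reported) in davis17_val.items():
    print(model, round(j_and_f(j, f), 2), is_consistent(j_and_f(j, f), reported))

# YouTube-VOS rows: (J seen, J unseen, F seen, F unseen, reported Overall)
ytvos_rows = {
    "2018 R50-AOST (L'=3)": (83.8, 79.3, 88.8, 87.9, 85.0),
    "2019 R50-AOST (L'=2)": (83.3, 78.9, 88.0, 87.1, 84.3),
    # Jaccard (Unseen) is listed as 754.8 for this row, which cannot be a percentage.
    "2019 R50-AOST (L'=1)": (81.0, 754.8, 85.6, 83.8, 81.5),
}
for model, (js, ju, fs, fu, reported) in ytvos_rows.items():
    overall = youtube_vos_overall(js, ju, fs, fu)
    print(model, round(overall, 2), is_consistent(overall, reported))
```

The last row fails the check, which confirms that the listed 754.8 is corrupted. The check does not determine the intended value, so that value should be taken from the paper itself.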
