Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Associating Objects with Transformers for Video Object Segmentation

Zongxin Yang, Yunchao Wei, Yi Yang

Date: 2021-06-04
Venue: NeurIPS 2021 12
Tasks: Visual Object Tracking, Semi-Supervised Video Object Segmentation, One-shot visual object segmentation, Semantic Segmentation, Video Object Segmentation, Video Semantic Segmentation

Abstract

This paper investigates how to achieve better and more efficient embedding learning for semi-supervised video object segmentation in challenging multi-object scenarios. State-of-the-art methods learn to decode features with a single positive object and therefore have to match and segment each target separately in multi-object scenarios, consuming several times the computing resources. To solve the problem, we propose an Associating Objects with Transformers (AOT) approach to match and decode multiple objects uniformly. In detail, AOT employs an identification mechanism to associate multiple targets into the same high-dimensional embedding space. Thus, we can simultaneously process multiple objects' matching and segmentation decoding as efficiently as processing a single object. To model multi-object association sufficiently, a Long Short-Term Transformer is designed to construct hierarchical matching and propagation. We conduct extensive experiments on both multi-object and single-object benchmarks to examine AOT variant networks with different complexities. In particular, our R50-AOT-L outperforms all the state-of-the-art competitors on three popular benchmarks, i.e., YouTube-VOS (84.1% J&F), DAVIS 2017 (84.9%), and DAVIS 2016 (91.1%), while running more than $3\times$ faster in multi-object scenarios. Meanwhile, our AOT-T can maintain real-time multi-object speed on the above benchmarks. Based on AOT, we ranked 1st in the 3rd Large-scale VOS Challenge.
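
The identification mechanism described in the abstract can be pictured with a toy example. The sketch below is a minimal illustration, not the authors' implementation: the names `build_id_bank` and `embed_label_map`, the random (rather than learned) identity vectors, and the zero vector for background pixels are all assumptions made for this example. It shows how a single embedding map can carry the identity of every object at once, so that matching does not need one pass per object.

```python
import random


def build_id_bank(max_objects, dim, seed=0):
    """Hypothetical identity bank: one vector per object slot.

    Random vectors stand in for the embeddings a real model would learn.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(max_objects)]


def embed_label_map(label_map, id_bank, background=0):
    """Turn an H x W map of object ids into an H x W x dim embedding map.

    Object k (1-based) receives id_bank[k - 1]; background pixels receive
    a zero vector (a simplification chosen for this sketch).
    """
    dim = len(id_bank[0])
    embedded = []
    for row in label_map:
        embedded_row = []
        for obj_id in row:
            if obj_id == background:
                embedded_row.append([0.0] * dim)
            else:
                embedded_row.append(list(id_bank[obj_id - 1]))
        embedded.append(embedded_row)
    return embedded


# Two objects plus background in a 2 x 3 frame.
label_map = [
    [0, 1, 1],
    [2, 2, 0],
]
id_bank = build_id_bank(max_objects=10, dim=4)
id_embedding = embed_label_map(label_map, id_bank)
```

In this toy setup the embedding map has the same shape whether the frame holds one object or ten (up to the bank size), which is the property that lets multiple objects share one matching and decoding pass.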

Results
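
The J&F and Overall values reported in this section are aggregates of the Jaccard and F-measure scores. As a rough sketch (the helper names are hypothetical, and the contour-based F-measure itself is omitted), the snippet below computes region similarity J as the intersection over union of two binary masks, J&F as the mean of J and F, and the YouTube-VOS Overall score as the mean of the four seen/unseen scores. Because the listed values are rounded, recomputed aggregates may differ from the listed ones by about 0.1.

```python
def jaccard(pred, gt):
    """Region similarity J: intersection over union of two binary masks.

    Masks are lists of rows of 0/1. Two empty masks score 1.0
    (a convention assumed for this sketch).
    """
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return 1.0 if union == 0 else inter / union


def j_and_f(j, f):
    """J&F: mean of region similarity J and contour accuracy F."""
    return (j + f) / 2


def youtube_vos_overall(j_seen, j_unseen, f_seen, f_unseen):
    """Overall score: mean of J and F over seen and unseen categories."""
    return (j_seen + j_unseen + f_seen + f_unseen) / 4


# R50-AOT-L scores taken from the results in this section.
davis17_val = j_and_f(82.3, 87.5)                      # listed J&F: 84.9
ytvos18 = youtube_vos_overall(83.7, 78.1, 88.5, 86.1)  # listed Overall: 84.1
```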

Task: Video

YouTube-VOS 2019

| Model | F-Measure (Seen) | F-Measure (Unseen) | Jaccard (Seen) | Jaccard (Unseen) |
|---|---|---|---|---|
| AOT | 88.1 | 86.3 | 83.5 | 78.4 |

DAVIS 2017 (test-dev)

| Model | F-measure | Jaccard | Mean Jaccard & F-Measure |
|---|---|---|---|
| AOT | 83.3 | 75.9 | 79.6 |

MOSE

| Model | F | J | J&F |
|---|---|---|---|
| AOT | 61.3 | 53.1 | 57.2 |

DAVIS 2017 (val)

| Model | F-measure (Mean) | J&F | Jaccard (Mean) | Params(M) | Speed (FPS) |
|---|---|---|---|---|---|
| SwinB-AOT-L | 88.4 | 85.4 | 82.4 | 65.4 | 12.1 |
| R50-AOT-L | 87.5 | 84.9 | 82.3 | 14.9 | 18 |
| AOT-L | 86.4 | 83.8 | 81.1 | 8.3 | 18.7 |
| AOT-B | 85.2 | 82.5 | 79.7 | 8.3 | 29.6 |
| AOT-S | 83.9 | 81.3 | 78.7 | 7 | 40 |
| AOT-T | 82.3 | 79.9 | 77.4 | 5.7 | 51.4 |

VOT2020

| Model | EAO | EAO (real-time) |
|---|---|---|
| SwinB-AOT-L | 0.586 | 0.523 |
| AOT-L | 0.574 | 0.56 |
| R50-AOT-L | 0.569 | 0.54 |
| AOT-B | 0.541 | 0.533 |
| AOT-S | 0.512 | 0.499 |
| AOT-T | 0.435 | 0.433 |

DAVIS 2016

| Model | F-measure (Mean) | J&F | Jaccard (Mean) | Speed (FPS) |
|---|---|---|---|---|
| SwinB-AOT-L | 93.3 | 92 | 90.7 | 12.1 |
| R50-AOT-L | 92.1 | 91.1 | 90.1 | 18 |
| AOT-L | 91.1 | 90.4 | 89.6 | 18.7 |
| AOT-L (second entry) | 91.1 | 89.9 | 88.7 | 29.6 |
| AOT-S | 90.2 | 89.4 | 88.6 | 40 |
| AOT-T | 87.4 | 86.8 | 86.1 | 51.4 |

Note: the source lists AOT-L twice for DAVIS 2016. The second entry's speed of 29.6 FPS matches AOT-B in the DAVIS 2017 results, so that row is probably AOT-B.

DAVIS 2017 (test-dev), by model variant

| Model | F-measure (Mean) | FPS | J&F | Jaccard (Mean) |
|---|---|---|---|---|
| SwinB-AOT-L | 85.1 | 12.1 | 81.2 | 77.3 |
| R50-AOT-L | 83.3 | 18 | 79.6 | 75.9 |
| AOT-L | 82.3 | 18.7 | 78.3 | 74.3 |
| AOT-B | 79.3 | 29.6 | 75.5 | 71.6 |
| AOT-S | 77.5 | 40 | 73.9 | 70.3 |
| AOT-T | 75.7 | 51.4 | 72 | 68.3 |

DAVIS (no YouTube-VOS training)

| Model | D17 val (F) | D17 val (G) | D17 val (J) | FPS |
|---|---|---|---|---|
| AOT-S | 82 | 79.2 | 76.4 | 40 |

YouTube-VOS 2018

| Model | F-Measure (Seen) | F-Measure (Unseen) | Jaccard (Seen) | Jaccard (Unseen) | Overall | Params(M) | Speed (FPS) |
|---|---|---|---|---|---|---|---|
| R50-AOT-L (all frames) | 89.5 | 88.2 | 84.5 | 79.6 | 85.5 | 14.9 | 6.4 |
| SwinB-AOT-L (all frames) | 90.1 | 86.9 | 85.1 | 78.4 | 85.1 | 65.4 | 5.2 |
| SwinB-AOT-L | 89.3 | 86.4 | 84.3 | 77.9 | 84.5 | 65.4 | 9.3 |
| AOT-L (all frames) | 88.8 | 87.1 | 83.7 | 78.4 | 84.5 | 8.3 | 6.5 |
| R50-AOT-L | 88.5 | 86.1 | 83.7 | 78.1 | 84.1 | 14.9 | 14.9 |
| AOT-B (all frames) | 88.5 | 86.5 | 83.6 | 78 | 84.1 | 8.3 | 20.5 |
| AOT-L | 87.9 | 86.5 | 82.9 | 77.7 | 83.8 | 8.3 | 16 |
| AOT-B | 87.5 | 86 | 82.6 | 77.7 | 83.5 | 8.3 | 20.5 |
| AOT-S (all frames) | 87 | 85.7 | 82.2 | 77.3 | 83 | 7.9 | 27.1 |
| AOT-S | 86.7 | 85 | 82 | 76.6 | 82.6 | 7.9 | 27.1 |
| AOT-T (all frames) | 84.7 | 83.5 | 80 | 75.2 | 80.9 | 5.3 | 41 |
| AOT-T | 84.5 | 82.2 | 80.1 | 74 | 80.2 | 5.3 | 41 |

Task: Object Tracking

VOT2022

| Model | EAO |
|---|---|
| MS_AOT | 0.673 |

Task: Video Object Segmentation

YouTube-VOS 2019

| Model | F-Measure (Seen) | F-Measure (Unseen) | Jaccard (Seen) | Jaccard (Unseen) |
|---|---|---|---|---|
| AOT | 88.1 | 86.3 | 83.5 | 78.4 |

DAVIS 2017 (test-dev)

| Model | F-measure | Jaccard | Mean Jaccard & F-Measure |
|---|---|---|---|
| AOT | 83.3 | 75.9 | 79.6 |

MOSE

| Model | F | J | J&F |
|---|---|---|---|
| AOT | 61.3 | 53.1 | 57.2 |

DAVIS 2017 (val)

| Model | F-measure (Mean) | J&F | Jaccard (Mean) | Params(M) | Speed (FPS) |
|---|---|---|---|---|---|
| SwinB-AOT-L | 88.4 | 85.4 | 82.4 | 65.4 | 12.1 |
| R50-AOT-L | 87.5 | 84.9 | 82.3 | 14.9 | 18 |
| AOT-L | 86.4 | 83.8 | 81.1 | 8.3 | 18.7 |
| AOT-B | 85.2 | 82.5 | 79.7 | 8.3 | 29.6 |
| AOT-S | 83.9 | 81.3 | 78.7 | 7 | 40 |
| AOT-T | 82.3 | 79.9 | 77.4 | 5.7 | 51.4 |

VOT2020

| Model | EAO | EAO (real-time) |
|---|---|---|
| SwinB-AOT-L | 0.586 | 0.523 |
| AOT-L | 0.574 | 0.56 |
| R50-AOT-L | 0.569 | 0.54 |
| AOT-B | 0.541 | 0.533 |
| AOT-S | 0.512 | 0.499 |
| AOT-T | 0.435 | 0.433 |

DAVIS 2016

| Model | F-measure (Mean) | J&F | Jaccard (Mean) | Speed (FPS) |
|---|---|---|---|---|
| SwinB-AOT-L | 93.3 | 92 | 90.7 | 12.1 |
| R50-AOT-L | 92.1 | 91.1 | 90.1 | 18 |
| AOT-L | 91.1 | 90.4 | 89.6 | 18.7 |
| AOT-L (second entry) | 91.1 | 89.9 | 88.7 | 29.6 |
| AOT-S | 90.2 | 89.4 | 88.6 | 40 |
| AOT-T | 87.4 | 86.8 | 86.1 | 51.4 |

Note: the source lists AOT-L twice for DAVIS 2016. The second entry's speed of 29.6 FPS matches AOT-B in the DAVIS 2017 results, so that row is probably AOT-B.

DAVIS 2017 (test-dev), by model variant

| Model | F-measure (Mean) | FPS | J&F | Jaccard (Mean) |
|---|---|---|---|---|
| SwinB-AOT-L | 85.1 | 12.1 | 81.2 | 77.3 |
| R50-AOT-L | 83.3 | 18 | 79.6 | 75.9 |
| AOT-L | 82.3 | 18.7 | 78.3 | 74.3 |
| AOT-B | 79.3 | 29.6 | 75.5 | 71.6 |
| AOT-S | 77.5 | 40 | 73.9 | 70.3 |
| AOT-T | 75.7 | 51.4 | 72 | 68.3 |

DAVIS (no YouTube-VOS training)

| Model | D17 val (F) | D17 val (G) | D17 val (J) | FPS |
|---|---|---|---|---|
| AOT-S | 82 | 79.2 | 76.4 | 40 |

YouTube-VOS 2018

| Model | F-Measure (Seen) | F-Measure (Unseen) | Jaccard (Seen) | Jaccard (Unseen) | Overall | Params(M) | Speed (FPS) |
|---|---|---|---|---|---|---|---|
| R50-AOT-L (all frames) | 89.5 | 88.2 | 84.5 | 79.6 | 85.5 | 14.9 | 6.4 |
| SwinB-AOT-L (all frames) | 90.1 | 86.9 | 85.1 | 78.4 | 85.1 | 65.4 | 5.2 |
| SwinB-AOT-L | 89.3 | 86.4 | 84.3 | 77.9 | 84.5 | 65.4 | 9.3 |
| AOT-L (all frames) | 88.8 | 87.1 | 83.7 | 78.4 | 84.5 | 8.3 | 6.5 |
| R50-AOT-L | 88.5 | 86.1 | 83.7 | 78.1 | 84.1 | 14.9 | 14.9 |
| AOT-B (all frames) | 88.5 | 86.5 | 83.6 | 78 | 84.1 | 8.3 | 20.5 |
| AOT-L | 87.9 | 86.5 | 82.9 | 77.7 | 83.8 | 8.3 | 16 |
| AOT-B | 87.5 | 86 | 82.6 | 77.7 | 83.5 | 8.3 | 20.5 |
| AOT-S (all frames) | 87 | 85.7 | 82.2 | 77.3 | 83 | 7.9 | 27.1 |
| AOT-S | 86.7 | 85 | 82 | 76.6 | 82.6 | 7.9 | 27.1 |
| AOT-T (all frames) | 84.7 | 83.5 | 80 | 75.2 | 80.9 | 5.3 | 41 |
| AOT-T | 84.5 | 82.2 | 80.1 | 74 | 80.2 | 5.3 | 41 |

Semi-Supervised Video Object SegmentationMOSEF61.3AOT
Semi-Supervised Video Object SegmentationMOSEJ53.1AOT
Semi-Supervised Video Object SegmentationMOSEJ&F57.2AOT
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)F-measure (Mean)88.4SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)J&F85.4SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Jaccard (Mean)82.4SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Params(M)65.4SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Speed (FPS)12.1SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)F-measure (Mean)87.5R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)J&F84.9R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Jaccard (Mean)82.3R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Params(M)14.9R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Speed (FPS)18R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)F-measure (Mean)86.4AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)J&F83.8AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Jaccard (Mean)81.1AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Params(M)8.3AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Speed (FPS)18.7AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)F-measure (Mean)85.2AOT-B
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)J&F82.5AOT-B
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Jaccard (Mean)79.7AOT-B
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Params(M)8.3AOT-B
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Speed (FPS)29.6AOT-B
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)F-measure (Mean)83.9AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)J&F81.3AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Jaccard (Mean)78.7AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Params(M)7AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Speed (FPS)40AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)F-measure (Mean)82.3AOT-T
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)J&F79.9AOT-T
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Jaccard (Mean)77.4AOT-T
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Params(M)5.7AOT-T
Semi-Supervised Video Object SegmentationDAVIS 2017 (val)Speed (FPS)51.4AOT-T
Semi-Supervised Video Object SegmentationVOT2020EAO0.586SwinB-AOT-L
Semi-Supervised Video Object SegmentationVOT2020EAO (real-time)0.523SwinB-AOT-L
Semi-Supervised Video Object SegmentationVOT2020EAO0.574AOT-L
Semi-Supervised Video Object SegmentationVOT2020EAO (real-time)0.56AOT-L
Semi-Supervised Video Object SegmentationVOT2020EAO0.569R50-AOT-L
Semi-Supervised Video Object SegmentationVOT2020EAO (real-time)0.54R50-AOT-L
Semi-Supervised Video Object SegmentationVOT2020EAO0.541AOT-B
Semi-Supervised Video Object SegmentationVOT2020EAO (real-time)0.533AOT-B
Semi-Supervised Video Object SegmentationVOT2020EAO0.512AOT-S
Semi-Supervised Video Object SegmentationVOT2020EAO (real-time)0.499AOT-S
Semi-Supervised Video Object SegmentationVOT2020EAO0.435AOT-T
Semi-Supervised Video Object SegmentationVOT2020EAO (real-time)0.433AOT-T
Semi-Supervised Video Object SegmentationDAVIS 2016F-measure (Mean)93.3SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016J&F92SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016Jaccard (Mean)90.7SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016Speed (FPS)12.1SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016F-measure (Mean)92.1R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016J&F91.1R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016Jaccard (Mean)90.1R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016Speed (FPS)18R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016F-measure (Mean)91.1AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016J&F90.4AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016Jaccard (Mean)89.6AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016Speed (FPS)18.7AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016F-measure (Mean)91.1AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016J&F89.9AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016Jaccard (Mean)88.7AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016Speed (FPS)29.6AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2016F-measure (Mean)90.2AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2016J&F89.4AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2016Jaccard (Mean)88.6AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2016Speed (FPS)40AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2016F-measure (Mean)87.4AOT-T
Semi-Supervised Video Object SegmentationDAVIS 2016J&F86.8AOT-T
Semi-Supervised Video Object SegmentationDAVIS 2016Jaccard (Mean)86.1AOT-T
Semi-Supervised Video Object SegmentationDAVIS 2016Speed (FPS)51.4AOT-T
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)F-measure (Mean)85.1SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)FPS12.1SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)J&F81.2SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)Jaccard (Mean)77.3SwinB-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)F-measure (Mean)83.3R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)FPS18R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)J&F79.6R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)Jaccard (Mean)75.9R50-AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)F-measure (Mean)82.3AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)FPS18.7AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)J&F78.3AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)Jaccard (Mean)74.3AOT-L
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)F-measure (Mean)79.3AOT-B
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)FPS29.6AOT-B
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)J&F75.5AOT-B
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)Jaccard (Mean)71.6AOT-B
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)F-measure (Mean)77.5AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)FPS40AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)J&F73.9AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)Jaccard (Mean)70.3AOT-S
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)F-measure (Mean)75.7AOT-T
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)FPS51.4AOT-T
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)J&F72AOT-T
Semi-Supervised Video Object SegmentationDAVIS 2017 (test-dev)Jaccard (Mean)68.3AOT-T
Semi-Supervised Video Object SegmentationDAVIS (no YouTube-VOS training)D17 val (F)82AOT-S
Semi-Supervised Video Object SegmentationDAVIS (no YouTube-VOS training)D17 val (G)79.2AOT-S
Semi-Supervised Video Object SegmentationDAVIS (no YouTube-VOS training)D17 val (J)76.4AOT-S
Semi-Supervised Video Object SegmentationDAVIS (no YouTube-VOS training)FPS40AOT-S
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 89.5 | R50-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 88.2 | R50-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 84.5 | R50-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 79.6 | R50-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 85.5 | R50-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Params(M) | 14.9 | R50-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 6.4 | R50-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 90.1 | SwinB-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 86.9 | SwinB-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 85.1 | SwinB-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 78.4 | SwinB-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 85.1 | SwinB-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Params(M) | 65.4 | SwinB-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 5.2 | SwinB-AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 89.3 | SwinB-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 86.4 | SwinB-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 84.3 | SwinB-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 77.9 | SwinB-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 84.5 | SwinB-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Params(M) | 65.4 | SwinB-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 9.3 | SwinB-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 88.8 | AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 87.1 | AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 83.7 | AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 78.4 | AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 84.5 | AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Params(M) | 8.3 | AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 6.5 | AOT-L (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 88.5 | R50-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 86.1 | R50-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 83.7 | R50-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 78.1 | R50-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 84.1 | R50-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Params(M) | 14.9 | R50-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 14.9 | R50-AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 88.5 | AOT-B (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 86.5 | AOT-B (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 83.6 | AOT-B (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 78 | AOT-B (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 84.1 | AOT-B (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Params(M) | 8.3 | AOT-B (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 20.5 | AOT-B (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 87.9 | AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 86.5 | AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 82.9 | AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 77.7 | AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 83.8 | AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Params(M) | 8.3 | AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 16 | AOT-L
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 87.5 | AOT-B
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 86 | AOT-B
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 82.6 | AOT-B
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 77.7 | AOT-B
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 83.5 | AOT-B
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Params(M) | 8.3 | AOT-B
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 20.5 | AOT-B
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 87 | AOT-S (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 85.7 | AOT-S (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 82.2 | AOT-S (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 77.3 | AOT-S (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 83 | AOT-S (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Params(M) | 7.9 | AOT-S (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 27.1 | AOT-S (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 86.7 | AOT-S
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 85 | AOT-S
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 82 | AOT-S
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 76.6 | AOT-S
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 82.6 | AOT-S
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Params(M) | 7.9 | AOT-S
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 27.1 | AOT-S
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 84.7 | AOT-T (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 83.5 | AOT-T (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 80 | AOT-T (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 75.2 | AOT-T (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 80.9 | AOT-T (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Params(M) | 5.3 | AOT-T (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 41 | AOT-T (all frames)
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Seen) | 84.5 | AOT-T
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | F-Measure (Unseen) | 82.2 | AOT-T
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Seen) | 80.1 | AOT-T
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Jaccard (Unseen) | 74 | AOT-T
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Overall | 80.2 | AOT-T
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Params(M) | 5.3 | AOT-T
Semi-Supervised Video Object Segmentation | YouTube-VOS 2018 | Speed (FPS) | 41 | AOT-T
Visual Object Tracking | VOT2022 | EAO | 0.673 | MS_AOT
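
The aggregate rows in the table above can be checked against their component metrics. Under the usual benchmark definitions, the DAVIS J&F score is the mean of the Jaccard (J) and F-measure (F) means, and the YouTube-VOS "Overall" score is the mean of the four seen/unseen J and F values. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, and the inputs are the AOT-T (DAVIS 2017 test-dev) and R50-AOT-L (YouTube-VOS 2018) rows copied from the table.

```python
from statistics import mean


def davis_j_and_f(jaccard_mean: float, f_measure_mean: float) -> float:
    """Average of region similarity (J) and contour accuracy (F)."""
    return mean([jaccard_mean, f_measure_mean])


def youtube_vos_overall(j_seen: float, j_unseen: float,
                        f_seen: float, f_unseen: float) -> float:
    """Average of J and F over seen and unseen object categories."""
    return mean([j_seen, j_unseen, f_seen, f_unseen])


# AOT-T on DAVIS 2017 (test-dev): J = 68.3, F = 75.7, reported J&F = 72
aot_t_davis = davis_j_and_f(68.3, 75.7)

# R50-AOT-L on YouTube-VOS 2018: reported Overall = 84.1
r50_aot_l_ytvos = youtube_vos_overall(
    j_seen=83.7, j_unseen=78.1, f_seen=88.5, f_unseen=86.1
)

print(f"AOT-T DAVIS J&F: {aot_t_davis:.1f}")
print(f"R50-AOT-L YouTube-VOS Overall: {r50_aot_l_ytvos:.1f}")
```

Because the table lists values rounded to one decimal, a recomputed aggregate may differ from the reported one in the last digit for some rows.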

Related Papers

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction (2025-07-21)
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model (2025-07-17)
SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation (2025-07-17)
Unified Medical Image Segmentation with State Space Modeling Snake (2025-07-17)
A Privacy-Preserving Semantic-Segmentation Method Using Domain-Adaptation Technique (2025-07-17)
SAMST: A Transformer framework based on SAM pseudo label filtering for remote sensing semi-supervised semantic segmentation (2025-07-16)
Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping (2025-07-15)
U-RWKV: Lightweight medical image segmentation with direction-adaptive RWKV (2025-07-15)