VOST
Videos
Introduced 2022-12-12
VOST consists of more than 700 high-resolution videos captured in diverse environments. The videos are 20 seconds long on average and densely labeled with instance masks. A careful, multi-step approach is adopted to ensure that these videos focus on complex transformations, capturing their full temporal extent.
Source: Breaking the “Object” in Video Object Segmentation
Image Source: https://www.vostdataset.org/