VOST

Videos

Introduced 2022-12-12

VOST consists of more than 700 high-resolution videos captured in diverse environments. The videos are 20 seconds long on average and densely labeled with instance masks. A careful, multi-step approach was adopted to ensure that the videos focus on complex transformations and capture the full temporal extent of each transformation.

Source: Breaking the “Object” in Video Object Segmentation

Image Source: https://www.vostdataset.org/