Long-RVOS

Texts | Videos | Introduced 2025-05-19

This work proposes Long-RVOS, a large-scale benchmark for long-term referring video object segmentation (RVOS). Long-RVOS is the first minute-level dataset in the RVOS field, designed to tackle realistic long-video challenges such as frequent occlusion, disappearance and reappearance of objects, and shot changes. Notably, Long-RVOS offers significantly longer video durations than existing datasets. In addition, it contains the largest number of object classes and mask annotations. The large scale of Long-RVOS supports comprehensive training and evaluation of RVOS models. Finally, we gather 24,689 high-quality descriptions to build Long-RVOS.