YouTube-VIS (trained without video masks)