VTC

Videos, Titles and Comments

Modalities: Audio, Images, Texts, Videos
License: CC-BY-NC-SA
Introduced: 2022-10-19

VTC is a large-scale multimodal dataset of about 300k video-caption pairs, together with their comments. It can be used for multimodal representation learning.