ImageNet-VidVRD
Videos
Introduced 2017-10-23
The ImageNet-VidVRD dataset contains 1,000 videos selected from the ILSVRC2016-VID dataset based on whether the video contains clear visual relations. It is split into a training set of 800 videos and a test set of 200 videos, and covers 35 categories of common subjects/objects and 132 categories of predicates. Ten people contributed to labeling the dataset, which involved both object trajectory labeling and relation labeling. Since the ILSVRC2016-VID dataset already has object trajectory annotations for 30 categories, we supplemented them by labeling the remaining 5 categories. To reduce the labor of relation labeling, we labeled typical segments of the videos in the training set and the entirety of each video in the test set.
Benchmarks
2D Semantic Segmentation/Recall@100
2D Semantic Segmentation/Recall@50
2D Semantic Segmentation/mAP
Scene Parsing/Recall@100
Scene Parsing/Recall@50
Scene Parsing/mAP
Scene Understanding/Recall@100
Scene Understanding/Recall@50
Scene Understanding/mAP
Video scene graph generation/Recall@50
Visual Relationship Detection/Recall@100
Visual Relationship Detection/Recall@50
Visual Relationship Detection/mAP
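
The Recall@50 and Recall@100 metrics listed above measure what fraction of the ground-truth relations appear among the top K scored predictions. The following is a minimal sketch of that idea, not the official evaluation code: it assumes each relation is a (subject, predicate, object) label triplet and matches on labels only, whereas a full evaluation on this dataset would presumably also compare the predicted object trajectories against the annotated ones. The function name `recall_at_k` and the example category names are hypothetical.

```python
from typing import List, Tuple

# Assumption: a relation is reduced to its label triplet (subject, predicate, object).
# Trajectory overlap is deliberately ignored in this simplified sketch.
Triplet = Tuple[str, str, str]


def recall_at_k(
    predictions: List[Tuple[Triplet, float]],
    ground_truth: List[Triplet],
    k: int,
) -> float:
    """Fraction of ground-truth triplets matched by the top-k scored predictions."""
    if not ground_truth:
        return 0.0
    # Keep the k predictions with the highest confidence scores.
    top_k = sorted(predictions, key=lambda p: p[1], reverse=True)[:k]
    remaining = list(ground_truth)
    hits = 0
    for triplet, _score in top_k:
        if triplet in remaining:
            # Each ground-truth instance can be matched at most once.
            remaining.remove(triplet)
            hits += 1
    return hits / len(ground_truth)


# Made-up labels for illustration only; they are not taken from the dataset.
example_predictions = [
    (("dog", "chase", "frisbee"), 0.9),
    (("person", "ride", "bicycle"), 0.8),
    (("dog", "chase", "frisbee"), 0.7),
]
example_ground_truth = [
    ("dog", "chase", "frisbee"),
    ("person", "watch", "dog"),
]
example_recall = recall_at_k(example_predictions, example_ground_truth, k=50)
```

In this sketch a duplicate prediction of an already matched relation does not count as a second hit, so recall cannot exceed 1.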