PHD²
Personalized Highlight Detection Dataset
Videos
Introduced 2018-04-18
The dataset contains information on what video segments a specific user considers a highlight. Having this kind of data allows for strong personalization models, as specific examples of what a user is interested in help models obtain a fine-grained understanding of that specific user.
The data consists of YouTube videos from which gifs.com users manually extracted their highlights by creating GIFs from a segment of the full video. Thus, the dataset is similar to the Video2GIF dataset, with two major differences.
- Each selection is associated with a user, which is what allows personalization.
- Instead of using visual matching to find the position in the video from which a GIF was selected, this dataset uses the timestamps directly. Thus, the ground truth is free from any alignment errors.
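
To illustrate how timestamp-based ground truth can be used, here is a minimal sketch that turns one user's highlight into per-segment labels. The `Highlight` record, its field names, the 5-second segment length, and the 50% overlap threshold are all assumptions made for this example; they are not the dataset's actual schema or an official labeling protocol.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Highlight:
    """One user selection. Field names are hypothetical, not the dataset's schema."""
    user_id: str
    video_id: str
    start_s: float  # GIF start time within the source video, in seconds
    end_s: float    # GIF end time within the source video, in seconds


def label_segments(highlight, video_duration_s, segment_s=5.0, min_overlap=0.5):
    """Return one 0/1 label per fixed-length segment of the video.

    A segment is labeled 1 when at least `min_overlap` of its length
    falls inside the highlight's [start_s, end_s] interval.
    """
    labels = []
    n_segments = math.ceil(video_duration_s / segment_s)
    for i in range(n_segments):
        seg_start = i * segment_s
        # The last segment may be shorter than segment_s.
        seg_end = min(seg_start + segment_s, video_duration_s)
        overlap = min(seg_end, highlight.end_s) - max(seg_start, highlight.start_s)
        fraction = max(overlap, 0.0) / (seg_end - seg_start)
        labels.append(1 if fraction >= min_overlap else 0)
    return labels


# Made-up example: a 30-second video with a highlight from 12 s to 21 s.
example = Highlight(user_id="u1", video_id="v1", start_s=12.0, end_s=21.0)
print(label_segments(example, video_duration_s=30.0))  # [0, 0, 1, 1, 0, 0]
```

Because the labels come straight from the recorded start and end times, no visual alignment step is needed. Keeping `user_id` on each record is what lets a model condition on a specific user's earlier selections.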
The training set contains highlights from 12,972 users. The test set contains highlights from 850 users.