KnowIT VQA
Texts, Videos
KnowIT VQA is a video dataset with 24,282 human-generated question-answer pairs about The Big Bang Theory. The dataset combines visual, textual, and temporal coherence reasoning with knowledge-based questions, which can only be answered using knowledge gained from watching the series.
Source: KnowIT VQA: Answering Knowledge-Based Questions about Videos