ActivityNet-QA

Texts, Videos

The ActivityNet-QA dataset contains 58,000 human-annotated question-answer (QA) pairs on 5,800 videos (ten per video on average) derived from the ActivityNet dataset. It provides a benchmark for testing the performance of VideoQA models on long-term spatio-temporal reasoning.

Source: ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering