Streaming Video Understanding Benchmark
This dataset card gives an overview of the SVBench dataset, including its purpose, structure, and sources. For details, see our project page, paper, and GitHub repository.
SVBench is the first benchmark specifically designed to evaluate long-context streaming video understanding through temporal multi-turn question-answering (QA) chains. It addresses the limitations of existing video understanding benchmarks by emphasizing continuous temporal reasoning over streaming video data.
The dataset includes:
Languages
License
Dataset Sources
Download SVBench dataset from Hugging Face:
git clone https://huggingface.co/yzy666/SVBench
Intended Use
Direct Use
Restrictions
SVBench/
├── Con/
│   ├── Con_EN/
│   └── Con_ZH/
├── Dialogue/
│   ├── Dialogue_EN/
│   └── Dialogue_ZH/
├── Meta/
│   ├── Meta_EN/
│   └── Meta_ZH/
├── Path/
├── Src/
├── Streaming/
│   ├── Streaming_EN/
│   └── Streaming_ZH/
├── Video/
└── Your_Model_Name/
    ├── dialogue/
    │   └── --NDulaHyrE.json
    └── streaming/
        └── -4h8cuweoKo.json
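The `Your_Model_Name/` folder in the tree above holds one JSON file per video under `dialogue/` and `streaming/`. The following is a minimal sketch of how such an output path might be built. The helper `result_path` is hypothetical, and it assumes that the file name is the video name plus `.json`, which the example file names suggest but this card does not state.

```python
from pathlib import Path

# Assumption: the two evaluation modes map to the two subfolders shown in the tree.
MODES = ("dialogue", "streaming")


def result_path(root, model_name, mode, video_name):
    """Return the path where one video's results would be stored (hypothetical helper)."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")
    # Assumption: the result file is named after the video, with a .json suffix.
    return Path(root) / model_name / mode / f"{video_name}.json"


path = result_path("SVBench", "Your_Model_Name", "dialogue", "--NDulaHyrE")
print(path.as_posix())  # SVBench/Your_Model_Name/dialogue/--NDulaHyrE.json
```

The content of each JSON file is not covered here; see the GitHub repository for the expected result format.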
Dataset Division:
| Key | Description |
| ------------------------- | ------------------------------------------------------------ |
| Video_Name | Unique identifier or title of the video file. |
| Sort_of_Set | Category or type of the dataset subset (e.g., "Train", "Test"). |
| Path_of_QandA | File path to the question-answer pairs. |
| Path_of_Con | Path to relationship files. |
| Path_of_StreamingPathData | Path to Q&A sequence for streaming evaluation. Each streaming path contains all Q&A sequences in streaming order within the path. |
| Path_of_Dialogue | Path to Q&A sequence for dialogue evaluation. Each dialogue contains all Q&A sequences in order within the dialogue. |
| Path_of_Streaming | Path to Q&A sequence for streaming evaluation represented only by serial numbers (e.g., the path [[0,0],[1,2],[2,3]...] indicates starting from the 1st question of the 1st chain, then proceeding to the 3rd question of the 2nd chain, and then to the 4th question of the 3rd chain, and so on). |
| Path_of_Video | Absolute file path to the raw video file. |
| Video_Duration | Total duration of the video in seconds. |
| Source_of_Dataset | Origin of the dataset. |
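The serial-number encoding used by `Path_of_Streaming` can be illustrated with a short sketch. Each pair is read as `[chain_index, question_index]`, both zero-based, as in the example in the table. The function `resolve_streaming_path` and the placeholder chain contents are hypothetical; the real Q&A files may be structured differently.

```python
def resolve_streaming_path(chains, path):
    """Turn [chain_index, question_index] pairs into the referenced Q&A items."""
    resolved = []
    for chain_idx, question_idx in path:
        # Both indices are zero-based: [1, 2] is the 3rd question of the 2nd chain.
        resolved.append(chains[chain_idx][question_idx])
    return resolved


# Hypothetical placeholder chains; real chains would hold question-answer pairs.
chains = [
    ["c1-q1", "c1-q2"],
    ["c2-q1", "c2-q2", "c2-q3"],
    ["c3-q1", "c3-q2", "c3-q3", "c3-q4"],
]
path = [[0, 0], [1, 2], [2, 3]]

print(resolve_streaming_path(chains, path))  # ['c1-q1', 'c2-q3', 'c3-q4']
```

The result follows the order described in the table: 1st question of the 1st chain, 3rd question of the 2nd chain, then 4th question of the 3rd chain.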
Submit results via https://forms.gle/Rmi6u4WGhyEZ2X7g8.
Submission Instructions:
Important Notes:
Submission Form

See GitHub repository for details.
Semi-automated annotation using a hybrid approach:
If you find our data useful, please consider citing our work!
@article{yang2025svbench,
  title={SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding},
  author={Yang, Zhenyu and Hu, Yuhang and Du, Zemin and Xue, Dizhan and Qian, Shengsheng and Wu, Jiahong and Yang, Fan and Dong, Weiming and Xu, Changsheng},
  journal={arXiv preprint arXiv:2502.10810},
  year={2025}
}