VCGBench-Diverse on VideoInstruct

Metric: Dense Captioning (higher is better)

Results

#	Model	Dense Captioning	Extra Data	SOTA	Paper	Date	Code
1	VideoGPT+	1.38	No	Yes	VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding	2024-06-13	Code
2	Chat-UniVi	1.33	No	Yes	Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding	2023-11-14	Code
3	VideoChat2	1.26	No	No	MVBench: A Comprehensive Multi-modal Video Understanding Benchmark	2023-11-28	Code
4	VTimeLLM	1.13	No	No	VTimeLLM: Empower LLM to Grasp Video Moments	2023-11-30	Code
5	BT-Adapter	1.03	No	Yes	BT-Adapter: Video Conversation is Feasible Without Video Instruction Tuning	2023-09-27	Code
6	Video-ChatGPT	0.89	No	Yes	Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models	2023-06-08	Code
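
Four of the six entries carry a SOTA marker. The page does not define it, but the markers are consistent with one reading: an entry is marked when its score beat every earlier-dated entry on this leaderboard. The Python sketch below checks that reading against the scores and dates listed above. The names `ENTRIES`, `rank_by_score`, and `record_setters` are illustrative and not part of any leaderboard API, and the running-best rule is an assumption.

```python
from datetime import date

# Rows copied from the leaderboard: (model, Dense Captioning score, date).
ENTRIES = [
    ("VideoGPT+", 1.38, date(2024, 6, 13)),
    ("Chat-UniVi", 1.33, date(2023, 11, 14)),
    ("VideoChat2", 1.26, date(2023, 11, 28)),
    ("VTimeLLM", 1.13, date(2023, 11, 30)),
    ("BT-Adapter", 1.03, date(2023, 9, 27)),
    ("Video-ChatGPT", 0.89, date(2023, 6, 8)),
]


def rank_by_score(entries):
    """Order entries from highest to lowest score (higher is better)."""
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def record_setters(entries):
    """Return the models that set a new best score, in date order.

    Assumption: an entry counts as a record setter when its score is
    strictly higher than every earlier-dated entry.
    """
    best_so_far = float("-inf")
    setters = []
    for model, score, _ in sorted(entries, key=lambda entry: entry[2]):
        if score > best_so_far:
            best_so_far = score
            setters.append(model)
    return setters


if __name__ == "__main__":
    for rank, (model, score, day) in enumerate(rank_by_score(ENTRIES), start=1):
        print(f"{rank}\t{model}\t{score:.2f}\t{day.isoformat()}")
    print("Record setters by date:", ", ".join(record_setters(ENTRIES)))
```

Under this assumption, the record setters are exactly the four marked entries. VideoChat2 and VTimeLLM are not among them because both were published after Chat-UniVi with lower scores.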