Metric: Dense Captioning (higher is better)
| # | Model | Dense Captioning | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | VideoGPT+ | 1.38 | No | VideoGPT+: Integrating Image and Video Encoders ... | 2024-06-13 | Code |
| 2 | Chat-UniVi | 1.33 | No | Chat-UniVi: Unified Visual Representation Empowe... | 2023-11-14 | Code |
| 3 | VideoChat2 | 1.26 | No | MVBench: A Comprehensive Multi-modal Video Under... | 2023-11-28 | Code |
| 4 | VTimeLLM | 1.13 | No | VTimeLLM: Empower LLM to Grasp Video Moments | 2023-11-30 | Code |
| 5 | BT-Adapter | 1.03 | No | BT-Adapter: Video Conversation is Feasible Witho... | 2023-09-27 | Code |
| 6 | Video-ChatGPT | 0.89 | No | Video-ChatGPT: Towards Detailed Video Understand... | 2023-06-08 | Code |