Metric: SODA (higher is better)
| # | Model | SODA | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | GVL | 7.11 | No | Learning Grounded Vision-Language Representation... | 2023-03-11 | Code |
| 2 | CM² | 6.18 | No | Do You Remember? Dense Video Captioning with Cro... | 2024-04-11 | Code |
| 3 | PDVC (TSP features, no SCST) | 6.05 | No | End-to-End Dense Video Captioning with Parallel ... | 2021-08-17 | Code |
| 4 | VTimeLLM | 5.8 | No | VTimeLLM: Empower LLM to Grasp Video Moments | 2023-11-30 | Code |