Metric: CIDEr (higher is better)
| # | Model | CIDEr | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | HiCM² | 71.84 | Yes | HiCM²: Hierarchical Compact Memory Modeling f... | 2024-12-19 | Code |
| 2 | Vid2Seq (HowTo100M+VidChapters-7M PT) | 67.2 | Yes | - | - | - |
| 3 | Vid2Seq | 47.1 | Yes | Vid2Seq: Large-Scale Pretraining of a Visual Lan... | 2023-02-27 | Code |
| 4 | CM² | 31.66 | No | Do You Remember? Dense Video Captioning with Cro... | 2024-04-11 | Code |
| 5 | GVL | 26.52 | No | Learning Grounded Vision-Language Representation... | 2023-03-11 | Code |
| 6 | PDVC (TSN features, no SCST) | 22.71 | No | End-to-End Dense Video Captioning with Parallel ... | 2021-08-17 | Code |
| 7 | Vid2Seq (HowTo100M+VidChapters-7M PT) | 13.3 | Yes | - | - | - |