Metric: R@1,IoU=0.7 (Recall@1 at an IoU threshold of 0.7, reported in %; higher is better)
| # | Model | R@1,IoU=0.7 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | GVL (paragraph-level) | 38.55 | No | Learning Grounded Vision-Language Representation... | 2023-03-11 | Code |
| 2 | LLaVA-MR | 35.68 | No | LLaVA-MR: Large Language-and-Vision Assistant fo... | 2024-11-21 | Code |
| 3 | UnLoc-L | 30.2 | No | UnLoc: A Unified Framework for Video Localizatio... | 2023-08-21 | Code |
| 4 | VLG-Net | 29.82 | No | VLG-Net: Video-Language Graph Matching Network f... | 2020-11-19 | Code |
| 5 | UnLoc-B | 29.7 | No | UnLoc: A Unified Framework for Video Localizatio... | 2023-08-21 | Code |
| 6 | GVL | 29.69 | No | Learning Grounded Vision-Language Representation... | 2023-03-11 | Code |
| 7 | DRN | 24.36 | No | Dense Regression Network for Video Grounding | 2020-04-07 | Code |
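
For readers unfamiliar with the metric, the following is a minimal sketch of how R@1,IoU=0.7 is commonly computed in video temporal grounding. It assumes each query has one ground-truth segment and that a top-1 prediction counts as a hit when its temporal IoU is at least 0.7. The function names `temporal_iou` and `recall_at_1`, the `(start, end)` representation in seconds, and the `>=` comparison are illustrative assumptions, not taken from any of the listed papers, whose evaluation scripts may differ in detail.

```python
from typing import Sequence, Tuple

# Hypothetical representation: a segment is (start_seconds, end_seconds).
Segment = Tuple[float, float]


def temporal_iou(pred: Segment, gt: Segment) -> float:
    """Intersection over union of two 1-D time intervals."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(
    top1_preds: Sequence[Segment],
    gts: Sequence[Segment],
    iou_threshold: float = 0.7,
) -> float:
    """Percentage of queries whose top-1 prediction reaches the IoU threshold.

    Assumption: a prediction with IoU exactly equal to the threshold is a hit.
    """
    if len(top1_preds) != len(gts):
        raise ValueError("need exactly one top-1 prediction per ground truth")
    if not gts:
        return 0.0
    hits = sum(
        temporal_iou(p, g) >= iou_threshold for p, g in zip(top1_preds, gts)
    )
    return 100.0 * hits / len(gts)


# Made-up toy data, for illustration only.
preds = [(10.0, 20.0), (5.0, 9.0), (0.0, 30.0), (2.0, 4.0)]
gts = [(11.0, 20.0), (6.0, 12.0), (0.0, 10.0), (2.0, 4.5)]
print(recall_at_1(preds, gts))  # 50.0 (IoUs are 0.9, ~0.43, ~0.33, 0.8)
```

Under this reading, a score of 38.55 in the table would mean that the top-ranked prediction reached an IoU of at least 0.7 for 38.55% of the test queries.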