Metric: R@1,IoU=0.1 (higher is better)
| # | Model | R@1,IoU=0.1 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | ReVisionLLM | 17.3 | No | ReVisionLLM: Recursive Vision-Language Model for... | 2024-11-22 | Code |
| 2 | DeCafNet | 13.25 | No | DeCafNet: Delegate and Conquer for Efficient Tem... | 2025-05-22 | Code |
| 3 | RGNet | 12.43 | No | RGNet: A Unified Clip Retrieval and Grounding Ne... | 2023-12-11 | Code |
| 4 | DenoiseLoc | 11.59 | No | Boundary-Denoising for Video Activity Localization | 2023-04-06 | Code |
| 5 | Zero-Shot CLIP + Guidance Model | 9.3 | No | Localizing Moments in Long Video Via Multimodal ... | 2023-02-26 | Code |
| 6 | CLIP | 6.57 | No | MAD: A Scalable Dataset for Language Grounding i... | 2021-12-01 | Code |
| 7 | VLG-Net + Guidance Model | 5.6 | No | Localizing Moments in Long Video Via Multimodal ... | 2023-02-26 | Code |
| 8 | VLG-Net | 3.5 | No | MAD: A Scalable Dataset for Language Grounding i... | 2021-12-01 | Code |
| 9 | Random Chance | 0.09 | No | MAD: A Scalable Dataset for Language Grounding i... | 2021-12-01 | Code |
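
For readers unfamiliar with the metric, the following is a minimal sketch of how R@1,IoU=0.1 is commonly computed for moment retrieval: a query counts as a hit when its top-ranked predicted segment overlaps the ground-truth segment with temporal IoU at or above 0.1. The function names, the `(start, end)` span format in seconds, the `>=` comparison, and the percentage scaling are assumptions made for illustration; they are not taken from any of the listed papers' evaluation code.

```python
from typing import Sequence, Tuple

# Hypothetical span format: (start_sec, end_sec), with start <= end.
Span = Tuple[float, float]


def temporal_iou(pred: Span, gt: Span) -> float:
    """Intersection over union of two 1-D time segments."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(
    top1_preds: Sequence[Span],
    gts: Sequence[Span],
    iou_threshold: float = 0.1,
) -> float:
    """Percentage of queries whose top-1 prediction reaches the IoU threshold.

    Assumes one top-ranked prediction and one ground-truth span per query,
    and treats IoU equal to the threshold as a hit.
    """
    if len(top1_preds) != len(gts):
        raise ValueError("expected one top-1 prediction per ground-truth span")
    if not gts:
        return 0.0
    hits = sum(
        temporal_iou(p, g) >= iou_threshold for p, g in zip(top1_preds, gts)
    )
    return 100.0 * hits / len(gts)


# Toy example with three queries: only the first prediction overlaps enough.
preds = [(0.0, 10.0), (50.0, 60.0), (100.0, 101.0)]
truth = [(5.0, 15.0), (0.0, 10.0), (100.0, 120.0)]
score = recall_at_1(preds, truth)
```

In this toy example one of three queries is a hit, so `score` is about 33.3. The leaderboard values above appear to be on the same percentage scale, though that is an inference from their magnitudes rather than something the table states.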