Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

LLaVA-MR

Reported on 9 benchmarks across 2 tasks · 1 paper · 2 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Computer Vision · 9 results

Moment Retrieval on QVHighlights
R@1 IoU=0.5 · 2024-11-21
76.59
SOTA
LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval arXiv:2411.14505
Moment Retrieval on QVHighlights
R@1 IoU=0.7 · 2024-11-21
61.48
SOTA
LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval arXiv:2411.14505
Video on ActivityNet Captions
R@1,IoU=0.5 · 2024-11-21
55.16
best: 60.67 (GVL (paragraph-level))
LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval arXiv:2411.14505
Video on ActivityNet Captions
R@1,IoU=0.7 · 2024-11-21
35.68
best: 38.55 (GVL (paragraph-level))
LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval arXiv:2411.14505
Moment Retrieval on Charades-STA
R@1 IoU=0.5 · 2024-11-21
70.65
best: 71.1 (SG-DETR (w/ PT))
LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval arXiv:2411.14505
Moment Retrieval on Charades-STA
R@1 IoU=0.7 · 2024-11-21
49.58
best: 52.8 (SG-DETR (w/ PT))
LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval arXiv:2411.14505
Moment Retrieval on QVHighlights
mAP · 2024-11-21
52.73
best: 58.8 (SG-DETR (w/ PT))
LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval arXiv:2411.14505
Moment Retrieval on QVHighlights
mAP@0.5 · 2024-11-21
69.41
best: 76.2 (SG-DETR (w/ PT))
LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval arXiv:2411.14505
Moment Retrieval on QVHighlights
mAP@0.75 · 2024-11-21
54.4
best: 60.8 (SG-DETR (w/ PT))
LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval arXiv:2411.14505