MAViL (Audio-Visual, single)
Reported on 2 benchmarks across 2 tasks
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Audio: 1 result
- Test mAP · uses extra data · 0.533 · best: 0.558 (OmniVec2)
Methodology: 1 result
- Test mAP · uses extra data · 0.533 · best: 0.558 (OmniVec2)