Explainable Cross-Attention Multimodal RNN

Reported on 4 benchmarks across 3 tasks

Note: results are matched by exact model name. Because different papers may use the same name for different model variants, the results listed here may not all refer to the same variant.