Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

LAFF

Reported on 35 benchmarks across 3 tasks · 1 paper · 17 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Computer Vision · 35 results

  • Video on VATEX
    text-to-video R@1 · 2021-12-03
    59.1
    best: 87.7 (GRAM)
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on VATEX
    text-to-video R@10 · 2021-12-03
    91.7
    best: 100 (GRAM)
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on VATEX
    text-to-video R@50 · 2021-12-03
    96.3
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on TGIF
    text-to-video R@1 · 2021-12-03
    24.5
    best: 25.5 (MDMMT-2)
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on TGIF
    text-to-video R@10 · 2021-12-03
    54.5
    best: 55.7 (MDMMT-2)
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on TGIF
    text-to-video R@5 · 2021-12-03
    45
    best: 46.1 (MDMMT-2)
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Ad-hoc video search on TRECVID-AVS20 (V3C1)
    infAP · uses extra data · 2021-12-03
    0.265
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Ad-hoc video search on TRECVID-AVS17 (IACC.3)
    infAP · uses extra data · 2021-12-03
    0.29
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Ad-hoc video search on TRECVID-AVS18 (IACC.3)
    infAP · uses extra data · 2021-12-03
    0.147
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Ad-hoc video search on TRECVID-AVS16 (IACC.3)
    infAP · uses extra data · 2021-12-03
    0.222
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Ad-hoc video search on TRECVID-AVS19 (V3C1)
    infAP · uses extra data · 2021-12-03
    0.192
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on VATEX
    text-to-video R@1 · 2021-12-03
    59.1
    best: 87.7 (GRAM)
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on VATEX
    text-to-video R@10 · 2021-12-03
    91.7
    best: 100 (GRAM)
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on VATEX
    text-to-video R@50 · 2021-12-03
    96.3
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on TGIF
    text-to-video R@1 · 2021-12-03
    24.5
    best: 25.5 (MDMMT-2)
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on TGIF
    text-to-video R@10 · 2021-12-03
    54.5
    best: 55.7 (MDMMT-2)
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on TGIF
    text-to-video R@5 · 2021-12-03
    45
    best: 46.1 (MDMMT-2)
    SOTA
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on MSR-VTT-1kA
    text-to-video R@1 · 2021-12-03
    45.8
    best: 62.9 (HunYuan_tvr (huge))
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on MSR-VTT-1kA
    text-to-video R@10 · 2021-12-03
    82
    best: 90.8 (HunYuan_tvr (huge))
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on MSR-VTT-1kA
    text-to-video R@5 · 2021-12-03
    71.5
    best: 84.5 (HunYuan_tvr (huge))
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on MSR-VTT
    text-to-video R@1 · 2021-12-03
    29.1
    best: 64 (GRAM)
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on MSR-VTT
    text-to-video R@10 · 2021-12-03
    65.8
    best: 89.6 (VAST)
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on MSR-VTT
    text-to-video R@5 · 2021-12-03
    54.9
    best: 84.3 (VAST)
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on MSVD
    text-to-video R@1 · 2021-12-03
    45.4
    best: 61.4 (InternVideo2-6B)
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on MSVD
    text-to-video R@10 · 2021-12-03
    84.6
    best: 90.3 (HunYuan_tvr (huge))
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video on MSVD
    text-to-video R@5 · 2021-12-03
    76
    best: 87.6 (CAMoE)
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on MSR-VTT-1kA
    text-to-video R@1 · 2021-12-03
    45.8
    best: 62.9 (HunYuan_tvr (huge))
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on MSR-VTT-1kA
    text-to-video R@10 · 2021-12-03
    82
    best: 90.8 (HunYuan_tvr (huge))
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on MSR-VTT-1kA
    text-to-video R@5 · 2021-12-03
    71.5
    best: 84.5 (HunYuan_tvr (huge))
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on MSR-VTT
    text-to-video R@1 · 2021-12-03
    29.1
    best: 64 (GRAM)
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on MSR-VTT
    text-to-video R@10 · 2021-12-03
    65.8
    best: 89.6 (VAST)
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on MSR-VTT
    text-to-video R@5 · 2021-12-03
    54.9
    best: 84.3 (VAST)
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on MSVD
    text-to-video R@1 · 2021-12-03
    45.4
    best: 61.4 (InternVideo2-6B)
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on MSVD
    text-to-video R@10 · 2021-12-03
    84.6
    best: 90.3 (HunYuan_tvr (huge))
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832
  • Video Retrieval on MSVD
    text-to-video R@5 · 2021-12-03
    76
    best: 87.6 (CAMoE)
    Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval · arXiv:2112.01832