Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Hero w/ pre-training

Reported on 8 benchmarks across 3 tasks · 1 paper · 8 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Computer Vision: 6 results

  • Video on TVR
    R@1 · 2020-05-01
    4.34
    SOTA
    HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training · arXiv:2005.00200
  • Video on TVR
    R@10 · 2020-05-01
    13.97
    SOTA
    HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training · arXiv:2005.00200
  • Video on TVR
    R@100 · 2020-05-01
    21.78
    SOTA
    HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training · arXiv:2005.00200
  • Video Retrieval on TVR
    R@1 · 2020-05-01
    4.34
    SOTA
    HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training · arXiv:2005.00200
  • Video Retrieval on TVR
    R@10 · 2020-05-01
    13.97
    SOTA
    HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training · arXiv:2005.00200
  • Video Retrieval on TVR
    R@100 · 2020-05-01
    21.78
    SOTA
    HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training · arXiv:2005.00200

Reasoning: 2 results

  • Video Question Answering on TVQA
    Accuracy · 2020-05-01
    74.24
    best: 82.2 (LLaMA-VQA)
    SOTA
    HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training · arXiv:2005.00200
  • Video Question Answering on How2QA
    Accuracy · 2020-05-01
    77.75
    best: 93.2 (Text + Text (no Multimodal Pretext Training))
    SOTA
    HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training · arXiv:2005.00200