Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

UNITER

Reported on 29 benchmarks across 5 tasks · 3 papers · 6 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
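The matching rule in the note above can be sketched in a few lines. This is a minimal illustration only, assuming results are stored as (model name, source) pairs; the function name `group_by_exact_name` and the `UNITER-large` row are hypothetical and are not taken from this page or from the site's actual code.

```python
from collections import defaultdict


def group_by_exact_name(rows):
    """Group (model_name, source) rows by the exact, case-sensitive model name."""
    groups = defaultdict(list)
    for model_name, source in rows:
        groups[model_name].append(source)
    return dict(groups)


rows = [
    ("UNITER", "arXiv:1909.11740"),
    ("UNITER", "arXiv:2301.04742"),
    ("UNITER", "arXiv:2110.13214"),
    ("UNITER-large", "hypothetical entry"),  # hypothetical variant name, for illustration
]

groups = group_by_exact_name(rows)
print(groups["UNITER"])
```

Under this assumption, all three papers that report a model called "UNITER" land in one group even if they evaluated different variants, while a differently spelled name such as the hypothetical "UNITER-large" is kept separate.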

Natural Language Processing · 20 results

  • Natural Language Inference on SNLI-VE val
    Accuracy · 2019-09-25
    78.98
    best: 91 (OFA)
    SOTA
    UNITER: UNiversal Image-TExt Representation Learning (arXiv:1909.11740)
  • Image-to-Text Retrieval on Flickr30k
    Recall@1 · 2023-01-11
    87.3
    best: 97.9 (InternVL-G-FT (finetuned, w/o ranking))
    HADA: A Graph-based Amalgamation Framework in Image-text Retrieval (arXiv:2301.04742)
  • Image-to-Text Retrieval on Flickr30k
    Recall@10 · 2023-01-11
    99.2
    best: 100 (InternVL-G-FT (finetuned, w/o ranking))
    HADA: A Graph-based Amalgamation Framework in Image-text Retrieval (arXiv:2301.04742)
  • Image-to-Text Retrieval on Flickr30k
    Recall@5 · 2023-01-11
    98
    best: 100 (InternVL-G-FT (finetuned, w/o ranking))
    HADA: A Graph-based Amalgamation Framework in Image-text Retrieval (arXiv:2301.04742)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Alg.) · 2021-10-25
    49.18
    best: 56.73 (Patch-TRM)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Com.) · 2021-10-25
    83.67
    best: 87 (Patch-TRM)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Cou.) · 2021-10-25
    71.01
    best: 77.81 (Patch-TRM)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Est.) · 2021-10-25
    99.41
    best: 99.54 (Top-Down)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Fra.) · 2021-10-25
    78.37
    best: 82.13 (Patch-TRM)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Geo.) · 2021-10-25
    81.31
    best: 82.61 (ViLT)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Mea.) · 2021-10-25
    99.38
    best: 99.46 (Top-Down)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Pat.) · 2021-10-25
    60.81
    best: 68.75 (Patch-TRM)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Pro.) · 2021-10-25
    87.84
    best: 95.73 (Patch-TRM)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Sce.) · 2021-10-25
    61.25
    best: 68.8 (ViT)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Sen.) · 2021-10-25
    86.1
    best: 92.49 (Patch-TRM)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Spa.) · 2021-10-25
    48.34
    best: 55.62 (Patch-TRM)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Reasoning (Tim.) · 2021-10-25
    69.77
    best: 77.98 (Patch-TRM)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Sub-tasks (Blank) · 2021-10-25
    78.53
    best: 83.62 (Patch-TRM)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Sub-tasks (Img.) · 2021-10-25
    78.71
    best: 82.66 (Patch-TRM)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)
  • Visual Question Answering (VQA) on IconQA
    Sub-tasks (Txt.) · 2021-10-25
    72.39
    best: 75.19 (Patch-TRM)
    IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning (arXiv:2110.13214)

Miscellaneous · 6 results

  • Image Retrieval with Multi-Modal Query on Flickr30k
    Image-to-text R@1 · 2019-09-25
    80.7
    best: 98.8 (X2-VLM (large))
    SOTA
    UNITER: UNiversal Image-TExt Representation Learning (arXiv:1909.11740)
  • Image Retrieval with Multi-Modal Query on Flickr30k
    Image-to-text R@5 · 2019-09-25
    95.7
    best: 100 (X2-VLM (large))
    SOTA
    UNITER: UNiversal Image-TExt Representation Learning (arXiv:1909.11740)
  • Image Retrieval with Multi-Modal Query on Flickr30k
    Text-to-image R@1 · 2019-09-25
    66.2
    best: 93.3 (ERNIE-ViL 2.0)
    SOTA
    UNITER: UNiversal Image-TExt Representation Learning (arXiv:1909.11740)
  • Image Retrieval with Multi-Modal Query on Flickr30k
    Text-to-image R@10 · 2019-09-25
    92.9
    best: 99.8 (ERNIE-ViL 2.0)
    SOTA
    UNITER: UNiversal Image-TExt Representation Learning (arXiv:1909.11740)
  • Image Retrieval with Multi-Modal Query on Flickr30k
    Text-to-image R@5 · 2019-09-25
    88.4
    best: 99.5 (M2-Encoder)
    SOTA
    UNITER: UNiversal Image-TExt Representation Learning (arXiv:1909.11740)
  • Image Retrieval with Multi-Modal Query on Flickr30k
    Image-to-text R@10 · 2023-01-11
    98
    best: 100 (X2-VLM (large))
    HADA: A Graph-based Amalgamation Framework in Image-text Retrieval (arXiv:2301.04742)

Computer Vision · 3 results

  • Image Retrieval on Flickr30k
    Recall@1 · 2023-01-11
    75.56
    best: 89.7 (BLIP-2 ViT-G (zero-shot, 1K test set))
    HADA: A Graph-based Amalgamation Framework in Image-text Retrieval (arXiv:2301.04742)
  • Image Retrieval on Flickr30k
    Recall@10 · 2023-01-11
    96.76
    best: 98.9 (BLIP-2 ViT-G (zero-shot, 1K test set))
    HADA: A Graph-based Amalgamation Framework in Image-text Retrieval (arXiv:2301.04742)
  • Image Retrieval on Flickr30k
    Recall@5 · 2023-01-11
    94.08
    best: 98.1 (BLIP-2 ViT-G (zero-shot, 1K test set))
    HADA: A Graph-based Amalgamation Framework in Image-text Retrieval (arXiv:2301.04742)