Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Prismer

Reported on 19 benchmarks across 2 tasks · 1 paper · 2 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
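The effect of matching by exact name can be illustrated with a minimal sketch. How the site actually performs the match is not stated here; the sketch assumes plain string equality, and every row other than the first is hypothetical and exists only to show the behavior.

```python
from collections import defaultdict

# (model name as reported, paper, benchmark).
# Only the first row comes from this page; the others are hypothetical.
rows = [
    ("Prismer", "arXiv:2303.02506", "nocaps val"),
    ("Prismer", "hypothetical-paper-B", "hypothetical-benchmark"),
    ("Prismer-Large", "hypothetical-paper-C", "hypothetical-benchmark"),
]

def group_by_exact_name(rows):
    """Group result rows by model name using exact string equality (assumed)."""
    groups = defaultdict(list)
    for name, paper, benchmark in rows:
        groups[name].append((paper, benchmark))
    return dict(groups)

groups = group_by_exact_name(rows)
for name, entries in groups.items():
    print(name, "->", sorted({paper for paper, _ in entries}))
```

Under this assumption, two papers that both report a model called "Prismer" land on the same page even if the variants differ, while "Prismer-Large" gets a page of its own.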

Natural Language Processing · 19 results

  • Image Captioning on nocaps val
    CIDEr · 2023-03-04
    107.9
    SOTA
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on nocaps val
    SPICE · 2023-03-04
    14.8
    SOTA
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Visual Question Answering (VQA) on VQA v2 test-dev
    Accuracy · 2023-03-04
    78.43
    best: 84.3 (PaLI)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Visual Question Answering (VQA) on VQA v2 test-std
    number · 2023-03-04
    61.39
    best: 72.24 (ONE-PEACE)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Visual Question Answering (VQA) on VQA v2 test-std
    other · 2023-03-04
    69.7
    best: 77.02 (mPLUG-Huge)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Visual Question Answering (VQA) on VQA v2 test-std
    overall · 2023-03-04
    78.49
    best: 84.03 (BEiT-3)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Visual Question Answering (VQA) on VQA v2 test-std
    yes/no · 2023-03-04
    93.09
    best: 94.85 (ONE-PEACE)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on nocaps entire
    B1 · 2023-03-04
    84.87
    best: 88.1 (GIT, Single Model)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on nocaps entire
    B2 · 2023-03-04
    69.99
    best: 74.81 (GIT, Single Model)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on nocaps entire
    B3 · 2023-03-04
    52.48
    best: 57.68 (GIT, Single Model)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on nocaps entire
    B4 · 2023-03-04
    33.66
    best: 37.71 (CoCa - Google Brain)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on nocaps entire
    CIDEr · 2023-03-04
    110.84
    best: 126.8 (Lyrics)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on nocaps entire
    METEOR · 2023-03-04
    31.13
    best: 32.5 (GIT, Single Model)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on nocaps entire
    ROUGE-L · 2023-03-04
    60.55
    best: 63.12 (GIT, Single Model)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on nocaps entire
    SPICE · 2023-03-04
    14.91
    best: 15.94 (GIT, Single Model)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on COCO Captions
    BLEU-4 · 2023-03-04
    40.4
    best: 46.5 (mPLUG)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on COCO Captions
    CIDEr · 2023-03-04
    136.5
    best: 155.1 (mPLUG)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on COCO Captions
    METEOR · 2023-03-04
    31.4
    best: 33.9 (CoCa)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
  • Image Captioning on COCO Captions
    SPICE · 2023-03-04
    24.4
    best: 27 (VAST)
    Prismer: A Vision-Language Model with Multi-Task Experts (arXiv:2303.02506)
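
The "best" lines above invite a quick comparison. As a rough sketch, the absolute gap between Prismer and the listed best score can be computed from a few of the entries; the scores are copied from this page, while the helper name `gap_to_best` and the choice of rows are illustrative assumptions.

```python
# (benchmark, metric, Prismer score, listed best score), copied from the entries above.
results = [
    ("VQA v2 test-dev", "Accuracy", 78.43, 84.3),
    ("nocaps entire", "CIDEr", 110.84, 126.8),
    ("COCO Captions", "BLEU-4", 40.4, 46.5),
]

def gap_to_best(score, best):
    """Absolute gap in metric points, rounded to two decimals (hypothetical helper)."""
    return round(best - score, 2)

for benchmark, metric, score, best in results:
    print(f"{benchmark} {metric}: {gap_to_best(score, best)} points below best")
```

The gaps are in each metric's own units, so they are not comparable across metrics.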