Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

PaLI

Reported on 36 benchmarks across 3 tasks · 2 papers · 17 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Natural Language Processing (30 results)

  • Visual Question Answering (VQA) on TextVQA test-standard
    overall · 2022-09-14
    73.1
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Visual Question Answering (VQA) on VizWiz 2020 VQA
    overall · 2022-09-14
    73.3
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Visual Question Answering (VQA) on VQA v2 test-dev
    Accuracy · 2022-09-14
    84.3
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps near-domain
    B3 · 2022-09-14
    58.99
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps near-domain
    B4 · 2022-09-14
    39.98
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps near-domain
    METEOR · 2022-09-14
    33.47
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps near-domain
    ROUGE-L · 2022-09-14
    63.99
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps out-of-domain
    B4 · 2022-09-14
    32
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps out-of-domain
    CIDEr · 2022-09-14
    126.67
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps out-of-domain
    METEOR · 2022-09-14
    30.99
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps out-of-domain
    ROUGE-L · 2022-09-14
    61.35
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps in-domain
    CIDEr · 2022-09-14
    149.1
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps in-domain
    METEOR · 2022-09-14
    34.22
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps in-domain
    ROUGE-L · 2022-09-14
    64.39
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Visual Question Answering (VQA) on InfoSeek
    Accuracy · 2023-02-23
    19.7
    best: 30.65 (RA-VQAv2 w/ PreFLMR)
    Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions? (arXiv:2302.11713)
  • Image Captioning on nocaps near-domain
    B1 · 2022-09-14
    88.57
    best: 88.9 (GIT2, Single Model)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps near-domain
    B2 · 2022-09-14
    75.56
    best: 75.86 (GIT2, Single Model)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps near-domain
    CIDEr · 2022-09-14
    124.35
    best: 125.51 (GIT2, Single Model)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps near-domain
    SPICE · 2022-09-14
    15.75
    best: 16.11 (GIT2, Single Model)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps out-of-domain
    B1 · 2022-09-14
    86.28
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps out-of-domain
    B2 · 2022-09-14
    71.19
    best: 71.28 (GIT, Single Model)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps out-of-domain
    B3 · 2022-09-14
    52.63
    best: 52.66 (GIT, Single Model)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps out-of-domain
    SPICE · 2022-09-14
    15.49
    best: 15.7 (GIT, Single Model)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps in-domain
    B1 · 2022-09-14
    88.02
    best: 88.86 (GIT2, Single Model)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps in-domain
    B2 · 2022-09-14
    75.21
    best: 76.1 (GIT, Single Model)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps in-domain
    B3 · 2022-09-14
    59.38
    best: 60.53 (GIT, Single Model)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps in-domain
    B4 · 2022-09-14
    41.16
    best: 41.65 (GIT, Single Model)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps in-domain
    CIDEr · 2022-09-14
    121.09
    best: 149.1
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Image Captioning on nocaps in-domain
    SPICE · 2022-09-14
    15.69
    best: 16.36 (GIT2, Single Model)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)

Computer Vision (8 results)

  • Zero-Shot Transfer Image Classification on ObjectNet
    Top 5 Accuracy · 2022-09-14
    58.35
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Zero-Shot Transfer Image Classification on ImageNet-S
    Accuracy (Private) · 2022-09-14
    63.83
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Zero-Shot Transfer Image Classification on ImageNet-S
    Top 5 Accuracy · 2022-09-14
    79.3
    SOTA
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Zero-Shot Transfer Image Classification on ImageNet V2
    Accuracy (Private) · 2022-09-14
    64.46
    best: 81.2 (BASIC (Lion))
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Zero-Shot Transfer Image Classification on ImageNet-A
    Accuracy (Private) · 2022-09-14
    44.7
    best: 90.2 (CoCa)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Zero-Shot Transfer Image Classification on ImageNet
    Accuracy (Private) · uses extra data · 2022-09-14
    72.11
    best: 88.5 (M2-Encoder)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Zero-Shot Transfer Image Classification on ImageNet-R
    Accuracy · 2022-09-14
    81.97
    best: 96.8 (BASIC (Lion))
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)
  • Zero-Shot Transfer Image Classification on ObjectNet
    Accuracy (Private) · 2022-09-14
    42.62
    best: 87.6 (LiT-22B)
    PaLI: A Jointly-Scaled Multilingual Language-Image Model (arXiv:2209.06794)