Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

PaLM 2-L (one-shot)

Reported on 14 benchmarks across 6 tasks · 1 paper · 1 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
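
The matching rule in the note above can be illustrated with a minimal sketch. It assumes results are stored as simple tuples and grouped by string equality on the model name; the record layout, the `group_by_exact_name` helper, and the `PaLM 2-L (1-shot)` spelling are hypothetical and not taken from the site's actual implementation.

```python
from collections import defaultdict

# (model name as reported, benchmark, metric, score)
# The first two rows use scores listed on this page. The third row is a
# hypothetical variant spelling with a placeholder score, for illustration only.
records = [
    ("PaLM 2-L (one-shot)", "TriviaQA", "EM", 86.1),
    ("PaLM 2-L (one-shot)", "Natural Questions", "EM", 37.5),
    ("PaLM 2-L (1-shot)", "TriviaQA", "EM", 0.0),
]


def group_by_exact_name(rows):
    """Group result rows by model name using exact, case-sensitive equality."""
    grouped = defaultdict(list)
    for name, benchmark, metric, score in rows:
        grouped[name].append((benchmark, metric, score))
    return dict(grouped)


by_name = group_by_exact_name(records)
print(sorted(by_name))
```

Under this assumption, a differently spelled name lands on a separate page even if it refers to the same model, and two distinct variants reported under an identical name would be merged.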

Natural Language Processing: 12 results

Question Answering on TriviaQA
EM · uses extra data · 2023-05-17
86.1
best: 87.5 (Claude 2 (few-shot, k=5))
SOTA
PaLM 2 Technical Report arXiv:2305.10403

Question Answering on Natural Questions
EM · 2023-05-17
37.5
best: 64 (Atlas (full, Wiki-dec-2018 index))
PaLM 2 Technical Report arXiv:2305.10403

Question Answering on Story Cloze
Accuracy · 2023-05-17
87.4
best: 87.8 (Neo-6B (QA + WS))
PaLM 2 Technical Report arXiv:2305.10403

Question Answering on MultiRC
F1 · 2023-05-17
88.2
best: 90.1 (PaLM 540B (finetuned))
PaLM 2 Technical Report arXiv:2305.10403

Question Answering on WebQuestions
EM · 2023-05-17
28.2
best: 84.6 (PoG-GPT4 (Tan et al., 2024))
PaLM 2 Technical Report arXiv:2305.10403

Question Answering on TyDiQA-GoldP
F1 · 2023-05-17
73.6
best: 88.5 (U-PaLM 62B (fine-tuned))
PaLM 2 Technical Report arXiv:2305.10403

Common Sense Reasoning on ReCoRD
F1 · 2023-05-17
93.8
best: 96.4 (Turing NLR v5 XXL 5.4B (fine-tuned))
PaLM 2 Technical Report arXiv:2305.10403

Word Sense Disambiguation on Words in Context
Accuracy · 2023-05-17
66.8
best: 85.3 (COSINE + Transductive Learning)
PaLM 2 Technical Report arXiv:2305.10403

Natural Language Inference on ANLI test
A1 · 2023-05-17
73.1
best: 81.8 (T5-3B (explanation prompting))
PaLM 2 Technical Report arXiv:2305.10403

Natural Language Inference on ANLI test
A2 · 2023-05-17
63.4
best: 72.5 (T5-3B (explanation prompting))
PaLM 2 Technical Report arXiv:2305.10403

Natural Language Inference on ANLI test
A3 · 2023-05-17
67.1
best: 74.8 (T5-3B (explanation prompting))
PaLM 2 Technical Report arXiv:2305.10403

Natural Language Inference on CommitmentBank
Accuracy · 2023-05-17
87.5
best: 100 (PaLM 540B (finetuned))
PaLM 2 Technical Report arXiv:2305.10403

Medical: 1 result

Language Modelling on LAMBADA
Accuracy · 2023-05-17
86.9
best: 89.7 (PaLM-540B (Few-Shot))
PaLM 2 Technical Report arXiv:2305.10403

Knowledge Base: 1 result

Text Summarization on X-Sum
ROUGE-2 · 2023-05-17
23.2
best: 26.7 (Selfmem)
PaLM 2 Technical Report arXiv:2305.10403