Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

GPT-3 175B (few-shot, k=32)

Reported on 10 benchmarks across 4 tasks · 1 paper · 1 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
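As an illustration of what exact-name matching implies, here is a minimal sketch in Python. It is not the site's actual implementation; the function name `group_by_exact_name` and the row layout are assumptions made for this example. The first two rows reuse scores from this page, and the third row is a hypothetical name variant with no score.

```python
from collections import defaultdict


def group_by_exact_name(results):
    """Group result rows by model name, comparing the strings exactly.

    No normalization (case folding, whitespace trimming) is applied, so
    two spellings of the same model end up in separate groups.
    """
    grouped = defaultdict(list)
    for row in results:
        grouped[row["model"]].append(row)
    return dict(grouped)


# Assumed row layout, for illustration only.
rows = [
    {"model": "GPT-3 175B (few-shot, k=32)", "benchmark": "CoQA", "score": 85.0},
    {"model": "GPT-3 175B (few-shot, k=32)", "benchmark": "COPA", "score": 92.0},
    # Hypothetical spelling variant: differs only in capitalization.
    {"model": "GPT-3 175B (Few-Shot, k=32)", "benchmark": "CoQA", "score": None},
]

grouped = group_by_exact_name(rows)
for name, items in grouped.items():
    print(name, [item["benchmark"] for item in items])
```

Under this scheme the capitalization variant is treated as a separate model, and, conversely, two papers that happen to use an identical name for different variants would be merged into one entry.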

Natural Language Processing · 10 results

  • Question Answering on CoQA
    Overall · 2020-05-28
    85
    SOTA
    Language Models are Few-Shot Learners · arXiv:2005.14165
  • Question Answering on COPA
    Accuracy · 2020-05-28
    92
    best: 100 (PaLM 540B (finetuned))
    Language Models are Few-Shot Learners · arXiv:2005.14165
  • Question Answering on QuAC
    F1 · 2020-05-28
    44.3
    best: 64.1 (FlowQA (single model))
    Language Models are Few-Shot Learners · arXiv:2005.14165
  • Question Answering on RACE
    RACE-m · 2020-05-28
    58.1
    best: 85.45 (XLNet)
    Language Models are Few-Shot Learners · arXiv:2005.14165
  • Question Answering on BoolQ
    Accuracy · 2020-05-28
    76.4
    best: 99.87 (Mistral-Nemo 12B (HPT))
    Language Models are Few-Shot Learners · arXiv:2005.14165
  • Question Answering on DROP Test
    F1 · 2020-05-28
    36.5
    best: 88.38 (QDGAT (ensemble))
    Language Models are Few-Shot Learners · arXiv:2005.14165
  • Question Answering on OpenBookQA
    Accuracy · 2020-05-28
    65.4
    best: 95.9 (GPT-4 + knowledge base)
    Language Models are Few-Shot Learners · arXiv:2005.14165
  • Word Sense Disambiguation on Words in Context
    Accuracy · 2020-05-28
    49.4
    best: 85.3 (COSINE + Transductive Learning)
    Language Models are Few-Shot Learners · arXiv:2005.14165
  • Natural Language Inference on CommitmentBank
    F1 · 2020-05-28
    52
    best: 100 (PaLM 540B (finetuned))
    Language Models are Few-Shot Learners · arXiv:2005.14165
  • Sentence Completion on HellaSwag
    Accuracy · 2020-05-28
    79.3
    best: 96.1 (CompassMTL 567M with Tailor)
    Language Models are Few-Shot Learners · arXiv:2005.14165