Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

OPT (few-shot, k=5)

Reported on 22 benchmarks across 2 tasks · 1 paper

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Natural Language Processing · 22 results

  • Question Answering on MedQA
    Accuracy · 2022-11-16
    22.8
    best: 91.1 (Med-Gemini)
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (Econometrics)
    Accuracy · 2022-11-16
    21
    best: 43 (Gopher (few-shot, k=5))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (College Biology)
    Accuracy · 2022-11-16
    30.6
    best: 95.8 (Med-PaLM 2 (ER))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (Machine Learning)
    Accuracy · 2022-11-16
    28.6
    best: 41.1 (Chinchilla (few-shot, k=5))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (High School Physics)
    Accuracy · 2022-11-16
    29.8
    best: 36.4 (Chinchilla (few-shot, k=5))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (Medical Genetics)
    Accuracy · 2022-11-16
    35
    best: 92 (Med-PaLM 2 (ER))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (High School Computer Science)
    Accuracy · 2022-11-16
    30
    best: 70 (GAL 120B (zero-shot))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (College Chemistry)
    Accuracy · 2022-11-16
    30
    best: 51 (Chinchilla (few-shot, k=5))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (College Computer Science)
    Accuracy · 2022-11-16
    17
    best: 51 (Chinchilla (few-shot, k=5))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (Astronomy)
    Accuracy · 2022-11-16
    23
    best: 73 (Chinchilla (few-shot, k=5))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (Electrical Engineer)
    Accuracy · 2022-11-16
    36.6
    best: 62.8 (GAL 120B (zero-shot))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (Formal Logic)
    Accuracy · 2022-11-16
    29.4
    best: 35.7 (Gopher (few-shot, k=5))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (High School Biology)
    Accuracy · 2022-11-16
    27.7
    best: 80.3 (Chinchilla (few-shot, k=5))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (High School Mathematics)
    Accuracy · 2022-11-16
    24.4
    best: 32.6 (GAL 120B (zero-shot))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MedMCQA
    Dev Set (Acc-%) · 2022-11-16
    0.296
    best: 66 (Meditron-70B (CoT + SC))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (High School Chemistry)
    Accuracy · 2022-11-16
    21.7
    best: 58.1 (Chinchilla (few-shot, k=5))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (Elementary Mathematics)
    Accuracy · 2022-11-16
    25.7
    best: 41.5 (Chinchilla (few-shot, k=5))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (Abstract Algebra)
    Accuracy · 2022-11-16
    21
    best: 33.3 (GAL 30B (zero-shot))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (High School Statistics)
    Accuracy · 2022-11-16
    43.5
    best: 58.8 (Chinchilla (few-shot, k=5))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (College Physics)
    Accuracy · 2022-11-16
    21.6
    best: 46.1 (Chinchilla (few-shot, k=5))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Question Answering on MMLU (College Mathematics)
    Accuracy · 2022-11-16
    33
    best: 43 (GAL 120B (zero-shot))
    Galactica: A Large Language Model for Science · arXiv:2211.09085
  • Common Sense Reasoning on ARC (Challenge)
    Accuracy · 2022-11-16
    31.1
    best: 96.4 (GPT-4 (few-shot, k=25))
    Galactica: A Large Language Model for Science · arXiv:2211.09085