Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Med-PaLM 2 (5-shot)

Reported on 9 benchmarks across 1 task · 1 paper · 1 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
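
As a rough illustration of what exact-name matching implies, the sketch below groups a few of the scores listed on this page by the literal model string. The row layout and the `group_by_exact_name` helper are hypothetical, chosen only for illustration; this is not the site's actual implementation.

```python
from collections import defaultdict

# Hypothetical rows (model name, benchmark, score); scores are copied from this page.
rows = [
    ("Med-PaLM 2 (5-shot)", "MMLU (Clinical Knowledge)", 88.3),
    ("Med-PaLM 2 (ER)", "MMLU (Clinical Knowledge)", 88.7),
    ("Med-PaLM 2 (5-shot)", "MMLU (Anatomy)", 77.8),
    ("Med-PaLM 2 (ER)", "MMLU (Anatomy)", 84.4),
]

def group_by_exact_name(rows):
    """Group results by the literal model string, with no normalization."""
    grouped = defaultdict(list)
    for name, benchmark, score in rows:
        grouped[name].append((benchmark, score))
    return dict(grouped)

by_model = group_by_exact_name(rows)
for name, results in by_model.items():
    print(name, results)
```

Under this assumption, "Med-PaLM 2 (5-shot)" and "Med-PaLM 2 (ER)" are kept as separate entries, and a paper that reuses one of those exact strings for a different variant would be merged into the same entry.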

Natural Language Processing · 9 results

  • Question Answering on MMLU (Professional Medicine)
    Accuracy · 2023-05-16
    95.2
    SOTA
    Towards Expert-Level Medical Question Answering with Large Language Models · arXiv:2305.09617
  • Question Answering on PubMedQA
    Accuracy · 2023-05-16
    79.2
    best: 81.6 (Meditron-70B (CoT + SC))
    Towards Expert-Level Medical Question Answering with Large Language Models · arXiv:2305.09617
  • Question Answering on MedQA
    Accuracy · 2023-05-16
    79.7
    best: 91.1 (Med-Gemini)
    Towards Expert-Level Medical Question Answering with Large Language Models · arXiv:2305.09617
  • Question Answering on MMLU (Clinical Knowledge)
    Accuracy · 2023-05-16
    88.3
    best: 88.7 (Med-PaLM 2 (ER))
    Towards Expert-Level Medical Question Answering with Large Language Models · arXiv:2305.09617
  • Question Answering on MMLU (College Biology)
    Accuracy · 2023-05-16
    94.4
    best: 95.8 (Med-PaLM 2 (ER))
    Towards Expert-Level Medical Question Answering with Large Language Models · arXiv:2305.09617
  • Question Answering on MMLU (Medical Genetics)
    Accuracy · 2023-05-16
    90
    best: 92 (Med-PaLM 2 (ER))
    Towards Expert-Level Medical Question Answering with Large Language Models · arXiv:2305.09617
  • Question Answering on MedMCQA
    Test Set (Acc-%) · 2023-05-16
    71.3
    best: 72.3 (Med-PaLM 2 (ER))
    Towards Expert-Level Medical Question Answering with Large Language Models · arXiv:2305.09617
  • Question Answering on MMLU (Anatomy)
    Accuracy · 2023-05-16
    77.8
    best: 84.4 (Med-PaLM 2 (ER))
    Towards Expert-Level Medical Question Answering with Large Language Models · arXiv:2305.09617
  • Question Answering on MMLU (College Medicine)
    Accuracy · 2023-05-16
    80.9
    best: 83.2 (Med-PaLM (ER))
    Towards Expert-Level Medical Question Answering with Large Language Models · arXiv:2305.09617