Question Answering on PubMedQA
Metric: Accuracy (higher is better)
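The accuracy figures on this leaderboard are percentages of correctly answered questions. As a minimal sketch of how such a score could be computed, assuming exact-match comparison over yes/no/maybe answer labels: the function name `accuracy_percent` and the toy labels are hypothetical, and this is not any listed paper's evaluation script.

```python
from typing import Sequence


def accuracy_percent(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Return the percentage of predictions that exactly match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty label set")
    # Count positions where the predicted label equals the reference label.
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Hypothetical toy labels for illustration only, not real PubMedQA data.
gold = ["yes", "no", "maybe", "yes", "no"]
predictions = ["yes", "no", "yes", "yes", "maybe"]
print(accuracy_percent(predictions, gold))
```

Individual papers may differ in prompt format, answer parsing, and test split, so scores from different rows are not necessarily produced by an identical procedure.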
Results
| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | Meditron-70B (CoT + SC) | 81.6 | No | MEDITRON-70B: Scaling Medical Pretraining for La... | 2023-11-27 | Code |
| 2 | BioGPT-Large(1.5B) | 81 | No | BioGPT: Generative Pre-trained Transformer for B... | 2022-10-19 | Code |
| 3 | RankRAG-llama3-70B (Zero-Shot) | 79.8 | No | RankRAG: Unifying Context Ranking with Retrieval... | 2024-07-02 | - |
| 4 | Med-PaLM 2 (5-shot) | 79.2 | No | Towards Expert-Level Medical Question Answering ... | 2023-05-16 | Code |
| 5 | Flan-PaLM (540B, Few-shot) | 79 | No | Large Language Models Encode Clinical Knowledge | 2022-12-26 | Code |
| 6 | BioGPT(345M) | 78.2 | No | BioGPT: Generative Pre-trained Transformer for B... | 2022-10-19 | Code |
| 7 | Codex 5-shot CoT | 78.2 | No | Can large language models reason about medical q... | 2022-07-17 | Code |
| 8 | Human Performance (single annotator) | 78 | No | PubMedQA: A Dataset for Biomedical Research Ques... | 2019-09-13 | Code |
| 9 | MetaGen Blended RAG (zero-shot) | 77.9 | No | MetaGen Blended RAG: Higher Accuracy for Domain-... | 2025-05-23 | Code |
| 10 | GAL 120B (zero-shot) | 77.6 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Code |
| 11 | Flan-PaLM (62B, Few-shot) | 77.2 | No | Large Language Models Encode Clinical Knowledge | 2022-12-26 | Code |
| 12 | MediSwift-XL | 76.8 | No | MediSwift: Efficient Sparse Pre-trained Biomedic... | 2024-03-01 | - |
| 13 | Flan-T5-XXL | 76.8 | No | - | - | - |
| 14 | BioMedGPT-10B | 76.1 | No | BioMedGPT: Open Multimodal Generative Pre-traine... | 2023-08-18 | Code |
| 15 | Claude 3 Opus (5-shot) | 75.8 | No | - | - | - |
| 16 | Flan-PaLM (540B, SC) | 75.2 | No | Large Language Models Encode Clinical Knowledge | 2022-12-26 | Code |
| 17 | Med-PaLM 2 (ER) | 75 | No | Towards Expert-Level Medical Question Answering ... | 2023-05-16 | Code |
| 18 | Claude 3 Opus (zero-shot) | 74.9 | No | - | - | - |
| 19 | Med-PaLM 2 (CoT + SC) | 74 | No | Towards Expert-Level Medical Question Answering ... | 2023-05-16 | Code |
| 20 | BLOOM (zero-shot) | 73.6 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Code |
| 21 | CoT-T5-11B (1024 Shot) | 73.42 | No | The CoT Collection: Improving Zero-shot and Few-... | 2023-05-23 | Code |
| 22 | BioLinkBERT (large) | 72.2 | No | LinkBERT: Pretraining Language Models with Docum... | 2022-03-29 | Code |
| 23 | BioLinkBERT (base) | 70.2 | No | LinkBERT: Pretraining Language Models with Docum... | 2022-03-29 | Code |
| 24 | OPT (zero-shot) | 70.2 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Code |
| 25 | Flan-PaLM (8B, Few-shot) | 67.6 | No | Large Language Models Encode Clinical Knowledge | 2022-12-26 | Code |
| 26 | BioELECTRA uncased | 64.2 | No | - | - | Code |
| 27 | PaLM (62B, Few-shot) | 57.8 | No | Large Language Models Encode Clinical Knowledge | 2022-12-26 | Code |
| 28 | PubMedBERT uncased | 55.84 | No | Domain-Specific Language Model Pretraining for B... | 2020-07-31 | Code |
| 29 | PaLM (540B, Few-shot) | 55 | No | Large Language Models Encode Clinical Knowledge | 2022-12-26 | Code |
| 30 | PaLM (8B, Few-shot) | 34 | No | Large Language Models Encode Clinical Knowledge | 2022-12-26 | Code |