Question Answering on MedQA
Metric: Accuracy (higher is better)
Results
# | Model | Accuracy (%) | Extra Data | Paper | Date | Code
1 | Med-Gemini | 91.1 | Yes | Capabilities of Gemini Models in Medicine | 2024-04-29 | -
2 | GPT-4 | 90.2 | Yes | Can Generalist Foundation Models Outcompete Spec... | 2023-11-28 | Code
3 | Med-PaLM 2 | 85.4 | No | Towards Expert-Level Medical Question Answering ... | 2023-05-16 | Code
4 | Med-PaLM 2 (CoT + SC) | 83.7 | No | Towards Expert-Level Medical Question Answering ... | 2023-05-16 | Code
5 | Med-PaLM 2 (5-shot) | 79.7 | No | Towards Expert-Level Medical Question Answering ... | 2023-05-16 | Code
6 | MedMobile (3.8B) | 75.7 | Yes | MedMobile: A mobile-sized language model with ex... | 2024-10-11 | Code
7 | Meerkat-7B | 74.3 | Yes | Small Language Models Learn Enhanced Reasoning S... | 2024-03-30 | -
8 | Meerkat-7B (Single) | 70.6 | Yes | Small Language Models Learn Enhanced Reasoning S... | 2024-03-30 | -
9 | Meditron-70B (CoT + SC) | 70.2 | No | MEDITRON-70B: Scaling Medical Pretraining for La... | 2023-11-27 | Code
10 | Flan-PaLM (540 B) | 67.6 | No | Large Language Models Encode Clinical Knowledge | 2022-12-26 | Code
11 | LLAMA-2 (70B SC CoT) | 61.5 | Yes | MEDITRON-70B: Scaling Medical Pretraining for La... | 2023-11-27 | Code
12 | Shakti-LLM (2.5B) | 60.3 | No | SHAKTI: A 2.5 Billion Parameter Small Language M... | 2024-10-15 | -
13 | Codex 5-shot CoT | 60.2 | No | Can large language models reason about medical q... | 2022-07-17 | Code
14 | LLAMA-2 (70B) | 59.2 | Yes | MEDITRON-70B: Scaling Medical Pretraining for La... | 2023-11-27 | Code
15 | VOD (BioLinkBERT) | 55 | No | Variational Open-Domain Question Answering | 2022-09-23 | Code
16 | BioMedGPT-10B | 50.4 | No | BioMedGPT: Open Multimodal Generative Pre-traine... | 2023-08-18 | Code
17 | PubMedGPT (2.7 B) | 50.3 | No | Large Language Models Encode Clinical Knowledge | 2022-12-26 | Code
18 | DRAGON + BioLinkBERT | 47.5 | No | Deep Bidirectional Language-Knowledge Graph Pret... | 2022-10-17 | Code
19 | BioLinkBERT (340 M) | 45.1 | No | Large Language Models Encode Clinical Knowledge | 2022-12-26 | Code
20 | GAL 120B (zero-shot) | 44.4 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Code
21 | BioLinkBERT (base) | 40 | No | LinkBERT: Pretraining Language Models with Docum... | 2022-03-29 | Code
22 | GrapeQA: PEGA | 39.51 | No | GrapeQA: GRaph Augmentation and Pruning to Enhan... | 2023-03-22 | -
23 | BioBERT (large) | 36.7 | No | BioBERT: a pre-trained biomedical language repre... | 2019-01-25 | Code
24 | BioBERT (base) | 34.1 | No | BioBERT: a pre-trained biomedical language repre... | 2019-01-25 | Code
25 | GPT-Neo (2.7 B) | 33.3 | No | Large Language Models Encode Clinical Knowledge | 2022-12-26 | Code
26 | BLOOM (few-shot, k=5) | 23.3 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Code
27 | OPT (few-shot, k=5) | 22.8 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Code