Metric: Test Set (Acc-%), reported below as a fraction of 1 (higher is better)
| # | Model | Test Set (Acc-%) | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Med-PaLM 2 (ER) | 0.723 | No | Towards Expert-Level Medical Question Answering ... | 2023-05-16 | Code |
| 2 | Med-PaLM 2 (CoT+SC) | 0.715 | No | Towards Expert-Level Medical Question Answering ... | 2023-05-16 | Code |
| 3 | Med-PaLM 2 (5-shot) | 0.713 | No | Towards Expert-Level Medical Question Answering ... | 2023-05-16 | Code |
| 4 | VOD (BioLinkBERT) | 0.629 | No | Variational Open-Domain Question Answering | 2022-09-23 | Code |
| 5 | Codex 5-shot CoT | 0.627 | No | Can large language models reason about medical q... | 2022-07-17 | Code |
| 6 | BioMedGPT-10B | 0.514 | No | BioMedGPT: Open Multimodal Generative Pre-traine... | 2023-08-18 | Code |
| 7 | PubmedBERT (Gu et al., 2022) | 0.41 | No | MedMCQA : A Large-scale Multi-Subject Multi-Choi... | 2022-03-27 | Code |
| 8 | SciBERT (Beltagy et al., 2019) | 0.39 | No | MedMCQA : A Large-scale Multi-Subject Multi-Choi... | 2022-03-27 | Code |
| 9 | BioBERT (Lee et al., 2020) | 0.37 | No | MedMCQA : A Large-scale Multi-Subject Multi-Choi... | 2022-03-27 | Code |
| 10 | BERT (Devlin et al., 2019)-Base | 0.33 | No | MedMCQA : A Large-scale Multi-Subject Multi-Choi... | 2022-03-27 | Code |