Metric: Accuracy (higher is better)
| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Med-PaLM 2 (ER) | 95.8 | No | Towards Expert-Level Medical Question Answering ... | 2023-05-16 | Code |
| 2 | Med-PaLM 2 (CoT + SC) | 95.1 | No | Towards Expert-Level Medical Question Answering ... | 2023-05-16 | Code |
| 3 | Med-PaLM 2 (5-shot) | 94.4 | No | Towards Expert-Level Medical Question Answering ... | 2023-05-16 | Code |
| 4 | Chinchilla (few-shot, k=5) | 79.9 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Code |
| 5 | Gopher (few-shot, k=5) | 70.8 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Code |
| 6 | GAL 120B (zero-shot) | 68.8 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Code |
| 7 | OPT (few-shot, k=5) | 30.6 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Code |
| 8 | BLOOM (few-shot, k=5) | 28.5 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Code |