Metric: Accuracy (higher is better)
| # | Model | Accuracy | Augmentations | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | PaLM 2 (few-shot, k=3, CoT) | 91.2 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 2 | PaLM 2 (few-shot, k=3, Direct) | 61.2 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 3 | Chinchilla-70B (few-shot, k=5) | 59.7 | No | Training Compute-Optimal Large Language Models | 2022-03-29 | Code |
| 4 | Gopher-280B (few-shot, k=5) | 49.2 | No | Scaling Language Models: Methods, Analysis & Ins... | 2021-12-08 | Code |
| 5 | PaLM 540B (few-shot, k=3) | 38 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 6 | BLOOM 176B (few-shot, k=3) | 36.8 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 7 | BloombergGPT (few-shot, k=3) | 34.8 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 8 | OPT 66B (few-shot, k=3) | 31.2 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 9 | GPT-NeoX (few-shot, k=3) | 26 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |