Metric: Accuracy (higher is better)
| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | PaLM 2 (few-shot, k=3, Direct) | 62 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 2 | PaLM 540B (few-shot, k=3) | 61 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 3 | PaLM 2 (few-shot, k=3, CoT) | 58.8 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 4 | Chinchilla-70B (few-shot, k=5) | 57.4 | No | Training Compute-Optimal Large Language Models | 2022-03-29 | Code |
| 5 | GPT-NeoX 20B (few-shot, k=3) | 52.41 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 6 | BLOOM 176B (few-shot, k=3) | 51.87 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 7 | OPT 66B (few-shot, k=3) | 51.87 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 8 | Gopher-280B (few-shot, k=5) | 50.8 | No | Scaling Language Models: Methods, Analysis & Ins... | 2021-12-08 | Code |
| 9 | BloombergGPT 50B (few-shot, k=3) | 49.73 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |