Metric: Execution Accuracy (higher is better)
| # | Model | Execution Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | APOLLO | 78.76 | No | APOLLO: An Optimized Training Approach for Long-... | 2022-12-14 | Code |
| 2 | GPT-4 (8k) | 76.48 | No | Are ChatGPT and GPT-4 General-Purpose Solvers fo... | 2023-05-10 | - |
| 3 | FinQANet (RoBERTa-large) | 68.9 | No | ConvFinQA: Exploring the Chain of Numerical Reas... | 2022-10-07 | Code |
| 4 | FinQANet (RoBERTa-large) | 68.9 | No | ConvFinQA: Exploring the Chain of Numerical Reas... | 2022-10-07 | Code |
| 5 | General Crowd | 46.9 | No | Are ChatGPT and GPT-4 General-Purpose Solvers fo... | 2023-05-10 | - |