Metric: F1 (higher is better)
| # | Model | F1 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Turing NLR v5 XXL 5.4B (fine-tuned) | 96.4 | No | Toward Efficient Language Model Pretraining and ... | 2022-12-04 | - |
| 2 | PaLM 540B (fine-tuned) | 94.6 | No | PaLM: Scaling Language Modeling with Pathways | 2022-04-05 | Code |
| 3 | DeBERTa-1.5B | 94.5 | No | DeBERTa: Decoding-enhanced BERT with Disentangle... | 2020-06-05 | Code |
| 4 | Vega v2 6B (fine-tuned) | 94.4 | No | Toward Efficient Language Model Pretraining and ... | 2022-12-04 | - |
| 5 | T5-11B | 94.1 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 6 | PaLM 2-L (one-shot) | 93.8 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 7 | PaLM 2-M (one-shot) | 92.4 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 8 | GESA 500M | 92.2 | No | Integrating a Heterogeneous Graph with Entity-aw... | 2023-07-19 | - |
| 9 | PaLM 2-S (one-shot) | 92.1 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 10 | LUKE-Graph | 91.5 | No | LUKE-Graph: A Transformer-based Approach with Ga... | 2023-03-12 | - |
| 11 | LUKE (single model) | 91.209 | No | - | - | - |
| 12 | LUKE 483M | 91.2 | No | LUKE: Deep Contextualized Entity Representations... | 2020-10-02 | Code |
| 13 | GPT-3 175B (one-shot) | 90.2 | No | Large Language Models are Zero-Shot Reasoners | 2022-05-24 | Code |
| 14 | KELM (fine-tuned, RoBERTa-large-based single model) | 89.6 | No | KELM: Knowledge Enhanced Pre-Trained Language Re... | 2021-09-09 | Code |
| 15 | AlexaTM 20B | 88.4 | No | AlexaTM 20B: Few-Shot Learning Using a Large-Sca... | 2022-08-02 | Code |
| 16 | XLNet + MTL + Verifier (ensemble) | 83.737 | No | - | - | - |
| 17 | BloombergGPT 50B (one-shot) | 82.8 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 18 | XLNet + Verifier | 82.7 | No | - | - | - |
| 19 | XLNet + MTL + Verifier (single model) | 82.664 | No | - | - | - |
| 20 | CSRLM (single model) | 82.584 | No | - | - | - |
| 21 | OPT 66B (one-shot) | 82.5 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 22 | SKG-NET (single model) | 80.038 | No | - | - | - |
| 23 | BLOOM 176B (one-shot) | 78.0 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 24 | KELM (fine-tuned, BERT-large-based single model) | 76.7 | No | KELM: Knowledge Enhanced Pre-Trained Language Re... | 2021-09-09 | Code |
| 25 | KT-NET (single model) | 73.62 | No | - | - | - |
| 26 | SKG-BERT (single model) | 72.778 | No | - | - | - |
| 27 | DCReader+BERT (single model) | 71.138 | No | - | - | - |
| 28 | GPT-NeoX 20B (one-shot) | 67.9 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 29 | GraphBert (single) | 62.986 | No | - | - | - |
| 30 | GraphBert-WordNet (single) | 61.885 | No | - | - | - |
| 31 | GraphBert-NELL (single) | 61.515 | No | - | - | - |
| 32 | BERT-Base (single model) | 56.065 | No | BERT: Pre-training of Deep Bidirectional Transfo... | 2018-10-11 | Code |
| 33 | DocQA + ELMo | 46.7 | No | ReCoRD: Bridging the Gap between Human and Machi... | 2018-10-30 | - |
| 34 | N-Grammer 343M | 29.9 | No | N-Grammer: Augmenting Transformers with latent n... | 2022-07-13 | Code |
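Reading-comprehension leaderboards like this one typically score answers with token-level F1, the harmonic mean of precision and recall over the tokens shared between a predicted answer and the reference answer. A minimal sketch of such a scorer (the exact text normalization each benchmark applies, e.g. lowercasing or punctuation stripping, varies and is omitted here):

```python
from collections import Counter

def f1_score(prediction: str, ground_truth: str) -> float:
    """Token-level F1 between a predicted and a reference answer string."""
    pred_tokens = prediction.split()
    gold_tokens = ground_truth.split()
    # Multiset intersection: counts each shared token at most as often
    # as it appears in both strings.
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```

For example, predicting "the cat sat" against the reference "the cat" gives precision 2/3 and recall 1, so F1 = 0.8; an exact match scores 1.0. Leaderboard values are this score averaged over all examples, reported as a percentage.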