Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Common Sense Reasoning on ReCoRD

Metric: F1 (higher is better)
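
The page does not say how F1 is computed. As a rough illustration, the sketch below assumes the SQuAD-style token-overlap F1 commonly used for extractive reading-comprehension benchmarks, taking the best score over all gold answers for a query. The function names (`normalize`, `token_f1`, `best_f1`) are hypothetical, and the normalization rules are an assumption, not the official evaluation script.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop ASCII punctuation and English articles, split on whitespace.

    Assumed normalization; the official scorer may differ in details.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall for one gold answer."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction: str, golds: list[str]) -> float:
    """Score a prediction against every gold answer and keep the maximum."""
    return max(token_f1(prediction, gold) for gold in golds)
```

Under this assumption, a leaderboard score would be the mean of `best_f1` over all evaluation queries, multiplied by 100.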

Results

| # | Model | F1 | Extra Data | Paper | Date | Code |
|---|-------|----|------------|-------|------|------|
| 1 | Turing NLR v5 XXL 5.4B (fine-tuned) | 96.4 | No | Toward Efficient Language Model Pretraining and ... | 2022-12-04 | - |
| 2 | PaLM 540B (finetuned) | 94.6 | No | PaLM: Scaling Language Modeling with Pathways | 2022-04-05 | Code |
| 3 | DeBERTa-1.5B | 94.5 | No | DeBERTa: Decoding-enhanced BERT with Disentangle... | 2020-06-05 | Code |
| 4 | Vega v2 6B (fine-tuned) | 94.4 | No | Toward Efficient Language Model Pretraining and ... | 2022-12-04 | - |
| 5 | T5-11B | 94.1 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 6 | PaLM 2-L (one-shot) | 93.8 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 7 | PaLM 2-M (one-shot) | 92.4 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 8 | GESA 500M | 92.2 | No | Integrating a Heterogeneous Graph with Entity-aw... | 2023-07-19 | - |
| 9 | PaLM 2-S (one-shot) | 92.1 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 10 | LUKE-Graph | 91.5 | No | LUKE-Graph: A Transformer-based Approach with Ga... | 2023-03-12 | - |
| 11 | LUKE (single model) | 91.209 | No | - | - | - |
| 12 | LUKE 483M | 91.2 | No | LUKE: Deep Contextualized Entity Representations... | 2020-10-02 | Code |
| 13 | GPT-3 175B (one-shot) | 90.2 | No | Large Language Models are Zero-Shot Reasoners | 2022-05-24 | Code |
| 14 | KELM (finetuning RoBERTa-large based single model) | 89.6 | No | KELM: Knowledge Enhanced Pre-Trained Language Re... | 2021-09-09 | Code |
| 15 | AlexaTM 20B | 88.4 | No | AlexaTM 20B: Few-Shot Learning Using a Large-Sca... | 2022-08-02 | Code |
| 16 | XLNet + MTL + Verifier (ensemble) | 83.737 | No | - | - | - |
| 17 | Bloomberg GPT 50B (1-shot) | 82.8 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 18 | XLNet + Verifier | 82.7 | No | - | - | - |
| 19 | XLNet + MTL + Verifier (single model) | 82.664 | No | - | - | - |
| 20 | CSRLM (single model) | 82.584 | No | - | - | - |
| 21 | OPT 66B (1-shot) | 82.5 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 22 | {SKG-NET} (single model) | 80.038 | No | - | - | - |
| 23 | BLOOM 176B (1-shot) | 78 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 24 | KELM (finetuning BERT-large based single model) | 76.7 | No | KELM: Knowledge Enhanced Pre-Trained Language Re... | 2021-09-09 | Code |
| 25 | KT-NET (single model) | 73.62 | No | - | - | - |
| 26 | SKG-BERT (single model) | 72.778 | No | - | - | - |
| 27 | DCReader+BERT (single model) | 71.138 | No | - | - | - |
| 28 | GPT-NeoX 20B (1-shot) | 67.9 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 29 | GraphBert (single) | 62.986 | No | - | - | - |
| 30 | GraphBert-WordNet (single) | 61.885 | No | - | - | - |
| 31 | GraphBert-NELL (single) | 61.515 | No | - | - | - |
| 32 | BERT-Base (single model) | 56.065 | No | BERT: Pre-training of Deep Bidirectional Transfo... | 2018-10-11 | Code |
| 33 | DocQA + ELMo | 46.7 | No | ReCoRD: Bridging the Gap between Human and Machi... | 2018-10-30 | - |
| 34 | N-Grammer 343M | 29.9 | No | N-Grammer: Augmenting Transformers with latent n... | 2022-07-13 | Code |