Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Natural Language Inference on CommitmentBank

Metric: Accuracy (higher is better)
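
The scores below are accuracies expressed as percentages. As a minimal sketch of how such a score is typically computed (the function name `accuracy_percent` and the example labels are hypothetical and are not taken from the leaderboard or from any listed paper's evaluation code):

```python
from typing import Sequence


def accuracy_percent(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Return the percentage of examples whose predicted label matches the gold label."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Hypothetical labels for illustration only (not real CommitmentBank data).
gold = ["entailment", "contradiction", "neutral", "entailment"]
pred = ["entailment", "contradiction", "entailment", "entailment"]
print(accuracy_percent(pred, gold))  # 3 of 4 correct -> 75.0
```

Individual papers may differ in evaluation split and prompting setup (fine-tuned, few-shot, one-shot), so scores computed this way are only comparable under the same setup.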

Results

| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | PaLM 540B (finetuned) | 100 | No | PaLM: Scaling Language Modeling with Pathways | 2022-04-05 | Code |
| 2 | Vega v2 6B (KD-based prompt transfer) | 99.2 | No | Toward Efficient Language Model Pretraining and ... | 2022-12-04 | - |
| 3 | ST-MoE-L 4.1B (fine-tuned) | 98.2 | No | ST-MoE: Designing Stable and Transferable Sparse... | 2022-02-17 | Code |
| 4 | ST-MoE-32B 269B (fine-tuned) | 98 | No | ST-MoE: Designing Stable and Transferable Sparse... | 2022-02-17 | Code |
| 5 | Turing NLR v5 XXL 5.4B (fine-tuned) | 97.6 | No | Toward Efficient Language Model Pretraining and ... | 2022-12-04 | - |
| 6 | DeBERTa-1.5B | 97.2 | No | DeBERTa: Decoding-enhanced BERT with Disentangle... | 2020-06-05 | Code |
| 7 | T5-XXL 11B (fine-tuned) | 96.8 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 8 | T5-Large 770M (fine-tuned) | 94.4 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 9 | T5-Base 220M (fine-tuned) | 94 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 10 | PaLM 2-L (one-shot) | 87.5 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 11 | PaLM 2-S (one-shot) | 82.1 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 12 | PaLM 2-M (one-shot) | 80.4 | No | PaLM 2 Technical Report | 2023-05-17 | Code |
| 13 | GPT-3 175B (Few-Shot) | 75.6 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code |
| 14 | N-Grammer 343M | 67.9 | No | N-Grammer: Augmenting Transformers with latent n... | 2022-07-13 | Code |
| 15 | AlexaTM 20B | 67.9 | No | AlexaTM 20B: Few-Shot Learning Using a Large-Sca... | 2022-08-02 | Code |
| 16 | Bloomberg GPT (one-shot) | 53.57 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 17 | GPT-NeoX (one-shot) | 48.21 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 18 | BLOOM 176B (one-shot) | 48.21 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 19 | OPT 66B (one-shot) | 44.64 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |