Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

BERT-LARGE

Reported on 11 benchmarks across 6 tasks · 2 papers · 6 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
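
As a rough illustration of what exact-name matching implies, the sketch below groups result rows by the literal model-name string. The row layout and the function name `group_by_exact_name` are assumptions made for this example, not the site's actual implementation. The first two rows use scores from this page; the `bert-large` row is a hypothetical, differently cased label added only to show that it would not be merged with `BERT-LARGE`.

```python
from collections import defaultdict

# Rows as (model name, benchmark, metric, score).
rows = [
    ("BERT-LARGE", "SWAG", "Test", 86.3),     # from this page
    ("BERT-LARGE", "GLUE", "Average", 82.1),  # from this page
    ("bert-large", "SWAG", "Test", 86.3),     # hypothetical label, not from this page
]


def group_by_exact_name(rows):
    """Group rows by the exact model-name string (case-sensitive, no normalization)."""
    grouped = defaultdict(list)
    for name, benchmark, metric, score in rows:
        grouped[name].append((benchmark, metric, score))
    return dict(grouped)


results = group_by_exact_name(rows)
# "BERT-LARGE" and "bert-large" end up under separate keys.
```

Under this assumption, two papers that both write `BERT-LARGE` would be pooled under one key even if they describe different variants, which is the caveat the note raises.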

Natural Language Processing: 11 results

Common Sense Reasoning on CommonsenseQA
Accuracy · uses extra data · 2018-11-02
55.9
best: 92.54 (GPT-4o (HPT))
SOTA
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge arXiv:1811.00937
Common Sense Reasoning on SWAG
Dev · 2018-10-11
86.6
SOTA
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding arXiv:1810.04805
Common Sense Reasoning on SWAG
Test · 2018-10-11
86.3
best: 90.8 (DeBERTalarge)
SOTA
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding arXiv:1810.04805
Semantic Textual Similarity on MRPC
F1 · 2018-10-11
89.3
best: 92.5 (T5-3B)
SOTA
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding arXiv:1810.04805
Semantic Textual Similarity on STS Benchmark
Spearman Correlation · 2018-10-11
0.865
best: 0.931 (Mnet-Sim)
SOTA
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding arXiv:1810.04805
Natural Language Understanding on GLUE
Average · 2018-10-11
82.1
best: 89.9 (MT-DNN-SMART)
SOTA
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding arXiv:1810.04805
Natural Language Inference on MultiNLI
Matched · 2018-10-11
86.7
best: 92.6 (Turing NLR v5 XXL 5.4B (fine-tuned))
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding arXiv:1810.04805
Natural Language Inference on MultiNLI
Mismatched · 2018-10-11
85.9
best: 92.4 (Turing NLR v5 XXL 5.4B (fine-tuned))
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding arXiv:1810.04805
Semantic Textual Similarity on Quora Question Pairs
F1 · 2018-10-11
72.1
best: 90.7 (ALICE)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding arXiv:1810.04805
Sentiment Analysis on SST-2 Binary classification
Accuracy · 2018-10-11
94.9
best: 97.5 (T5-11B)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding arXiv:1810.04805
Paraphrase Identification on Quora Question Pairs
F1 · 2018-10-11
72.1
best: 90.7 (ALICE)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding arXiv:1810.04805
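
To compare the listed scores with the "best" figures, the following sketch computes the gap for a few of the entries. All numbers are copied from the listing; the tuple layout and the helper name `gap_to_best` are assumptions made for this example.

```python
# (benchmark, metric, BERT-LARGE score, best score, best model), copied from the listing.
listing = [
    ("SWAG", "Test", 86.3, 90.8, "DeBERTalarge"),
    ("MRPC", "F1", 89.3, 92.5, "T5-3B"),
    ("GLUE", "Average", 82.1, 89.9, "MT-DNN-SMART"),
    ("SST-2 Binary classification", "Accuracy", 94.9, 97.5, "T5-11B"),
]


def gap_to_best(score, best):
    """Return best minus score, rounded to two decimals to hide float noise."""
    return round(best - score, 2)


gaps = {
    (benchmark, metric): gap_to_best(score, best)
    for benchmark, metric, score, best, _best_model in listing
}

for (benchmark, metric), gap in gaps.items():
    print(f"{benchmark} ({metric}): {gap} points behind the listed best")
```

The gap is only meaningful within a single benchmark and metric, since the scales differ (for example, STS Benchmark reports a correlation between 0 and 1 rather than a percentage).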