Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

RoBERTa

Reported on 67 benchmarks across 26 tasks · 10 papers · 16 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
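The note above can be illustrated with a minimal sketch of exact-name matching. The record layout, the `results_for` and `papers_for` helpers, and the third record (a placeholder name variant with no value) are assumptions made for illustration; they are not the site's actual schema or code. The first two records reuse values listed on this page.

```python
# Hypothetical result records; field names are assumptions, not the site's schema.
records = [
    {"model": "RoBERTa", "benchmark": "RACE", "metric": "Accuracy",
     "value": 83.2, "paper": "arXiv:1907.11692"},
    {"model": "RoBERTa", "benchmark": "MR", "metric": "Accuracy",
     "value": 89.42, "paper": "arXiv:2211.16878"},
    # Placeholder entry under a different name string (no real result implied).
    {"model": "RoBERTa-large", "benchmark": "RACE", "metric": "Accuracy",
     "value": None, "paper": None},
]


def results_for(model_name, rows):
    """Return rows whose model field equals model_name exactly (case-sensitive)."""
    return [row for row in rows if row["model"] == model_name]


def papers_for(model_name, rows):
    """Return the sorted set of papers that reported results under this exact name."""
    return sorted({row["paper"] for row in results_for(model_name, rows)})


matched = results_for("RoBERTa", records)
sources = papers_for("RoBERTa", records)
print(len(matched), sources)
```

Under this scheme the two "RoBERTa" rows are grouped together even though they come from different papers, which may have used different variants or fine-tuning setups, while the "RoBERTa-large" row is left out because its name string differs.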

Natural Language Processing · 36 results

  • Text Classification on NICE-2
    Accuracy · 2022-11-30
    99.76
    SOTA
    Transformers are Short Text Classifiers: A Study of Inductive Short Text Classifiers on Benchmarks and Real-world Datasets · arXiv:2211.16878
  • Binary text classification on TURINGBENCH (Turing Test, FAIR_wmt20)
    F1 score · 2021-09-27
    0.4531
    best: 0.9966 (GigaCheck (Mistral-7B))
    SOTA
    TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation · arXiv:2109.13296
  • Binary text classification on TURINGBENCH (Turing Test, GPT-3)
    F1 score · 2021-09-27
    0.5209
    best: 0.9709 (GigaCheck (Mistral-7B))
    SOTA
    TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation · arXiv:2109.13296
  • Reading Comprehension on RACE
    Accuracy · 2019-07-26
    83.2
    best: 91.4 (ALBERT (Ensemble))
    SOTA
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Common Sense Reasoning on SWAG
    Test · 2019-07-26
    89.9
    best: 90.8 (DeBERTalarge)
    SOTA
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Text Classification on arXiv-10
    Accuracy · 2019-07-26
    0.779
    best: 0.794 (Protoformer)
    SOTA
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Text Classification on UK Key Stage Readability
    F1 · 2024-11-26
    73.1
    best: 99.6 (ELECTRA + ANN)
    What Differentiates Educational Literature? A Multimodal Fusion Approach of Transformers and Computational Linguistics · arXiv:2411.17593
  • Text Classification on MR
    Accuracy · 2022-11-30
    89.42
    best: 93.3 (VLAWE)
    Transformers are Short Text Classifiers: A Study of Inductive Short Text Classifiers on Benchmarks and Real-world Datasets · arXiv:2211.16878
  • Relation Extraction on SemEval-2010 Task-8
    F1 · 2022-08-20
    88.7
    best: 91.9 (SP)
    SPOT: Knowledge-Enhanced Language Representations for Information Extraction · arXiv:2208.09625
  • Natural Language Understanding on LexGLUE
    CaseHOLD · 2021-10-03
    71.7
    best: 75.6 (CaseLaw-BERT)
    LexGLUE: A Benchmark Dataset for Legal Language Understanding in English · arXiv:2110.00976
  • Code Generation on CodeSearchNet
    Smoothed BLEU-4 · 2020-02-19
    14.52
    best: 15.99 (CodeBERT (MLM+RTD))
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Generation on CodeSearchNet - Python
    Smoothed BLEU-4 · 2020-02-19
    14.92
    best: 20.39 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Generation on CodeSearchNet - Go
    Smoothed BLEU-4 · 2020-02-19
    26.09
    best: 26.79 (CodeBERT (MLM))
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Generation on CodeSearchNet - JavaScript
    Smoothed BLEU-4 · 2020-02-19
    5.72
    best: 25.61 (Transformer)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Generation on CodeSearchNet - Php
    Smoothed BLEU-4 · 2020-02-19
    19.9
    best: 26.23 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Generation on CodeSearchNet - Java
    Smoothed BLEU-4 · 2020-02-19
    13.2
    best: 21.87 (CodeTrans-MT-Large)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Generation on CodeSearchNet - Ruby
    Smoothed BLEU-4 · 2020-02-19
    7.26
    best: 15.26 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Relation Extraction on TACRED
    F1 · 2020-02-05
    71.3
    best: 86.6 (RAG4RE)
    K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters · arXiv:2002.01808
  • Relation Classification on TACRED
    F1 · 2020-02-05
    71.3
    best: 76.8 (DeepStruct multi-task w/ finetune)
    K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters · arXiv:2002.01808
  • Negation Detection on *sem 2012 Shared Task: Sherlock Dataset
    F1 · uses extra data · 2020-01-09
    91.59
    best: 97.26 (NegBioELECTRA)
    Resolving the Scope of Speculation and Negation using Transformer-Based Architectures · arXiv:2001.02885
  • Reading Comprehension on RACE
    Accuracy (High) · 2019-07-26
    81.3
    best: 92.6 (ALBERTxxlarge+DUMA(ensemble))
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Reading Comprehension on RACE
    Accuracy (Middle) · 2019-07-26
    86.5
    best: 93.1 (Megatron-BERT (ensemble))
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Natural Language Inference on MultiNLI
    Matched · 2019-07-26
    90.8
    best: 92.6 (Turing NLR v5 XXL 5.4B (fine-tuned))
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Semantic Textual Similarity on STS Benchmark
    Pearson Correlation · 2019-07-26
    0.922
    best: 0.929 (MT-DNN-SMART)
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Question Answering on CronQuestions
    Hits@1
    22.5
    best: 97.8 (GenTKGQA)
  • Abuse Detection on HopeEDI
    Weighted Average F1-score
    0.93
  • Abuse Detection on HopeEDI
    Weighted Average F1-score
    0.93
  • Hate Speech Detection on HopeEDI
    Weighted Average F1-score
    0.93
  • Hate Speech Detection on HopeEDI
    Weighted Average F1-score
    0.93
  • Cross-Lingual on Reddit Ideological and Extreme Bias Dataset
    weighted-F1 score
    75.2
    best: 79.1 (SVM)
  • Abstractive Text Summarization on EDUsum
    ROUGE-1
    63.22
    best: 64.48 (GP_Step_Sim)
  • Abstractive Text Summarization on EDUsum
    ROUGE-2
    51.34
    best: 52.7 (GP_Step_Sim)
  • Abstractive Text Summarization on EDUsum
    ROUGE-L
    60.26
    best: 61.91 (GP_Step_Sim)
  • Cross-Lingual Document Classification on Reddit Ideological and Extreme Bias Dataset
    weighted-F1 score
    75.2
    best: 79.1 (SVM)
  • Hope Speech Detection on HopeEDI
    Weighted Average F1-score
    0.93
  • Hope Speech Detection on HopeEDI
    Weighted Average F1-score
    0.93

Computer Code · 15 results

  • Program Synthesis on ManyTypes4TypeScript
    Average Accuracy · 2019-07-26
    59.84
    best: 71.27 (CodeTIDAL5)
    SOTA
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Program Synthesis on ManyTypes4TypeScript
    Average F1 · 2019-07-26
    57.54
    best: 60.57 (GraphCodeBERT)
    SOTA
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Program Synthesis on ManyTypes4TypeScript
    Average Precision · 2019-07-26
    57.45
    best: 60.06 (GraphCodeBERT)
    SOTA
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Program Synthesis on ManyTypes4TypeScript
    Average Recall · 2019-07-26
    57.62
    best: 61.08 (GraphCodeBERT)
    SOTA
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Type prediction on ManyTypes4TypeScript
    Average Accuracy · 2019-07-26
    59.84
    best: 71.27 (CodeTIDAL5)
    SOTA
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Type prediction on ManyTypes4TypeScript
    Average F1 · 2019-07-26
    57.54
    best: 60.57 (GraphCodeBERT)
    SOTA
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Type prediction on ManyTypes4TypeScript
    Average Precision · 2019-07-26
    57.45
    best: 60.06 (GraphCodeBERT)
    SOTA
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Type prediction on ManyTypes4TypeScript
    Average Recall · 2019-07-26
    57.62
    best: 61.08 (GraphCodeBERT)
    SOTA
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Code Documentation Generation on CodeSearchNet
    Smoothed BLEU-4 · 2020-02-19
    14.52
    best: 15.99 (CodeBERT (MLM+RTD))
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Documentation Generation on CodeSearchNet - Python
    Smoothed BLEU-4 · 2020-02-19
    14.92
    best: 20.39 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Documentation Generation on CodeSearchNet - Go
    Smoothed BLEU-4 · 2020-02-19
    26.09
    best: 26.79 (CodeBERT (MLM))
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Documentation Generation on CodeSearchNet - JavaScript
    Smoothed BLEU-4 · 2020-02-19
    5.72
    best: 25.61 (Transformer)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Documentation Generation on CodeSearchNet - Php
    Smoothed BLEU-4 · 2020-02-19
    19.9
    best: 26.23 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Documentation Generation on CodeSearchNet - Java
    Smoothed BLEU-4 · 2020-02-19
    13.2
    best: 21.87 (CodeTrans-MT-Large)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Code Documentation Generation on CodeSearchNet - Ruby
    Smoothed BLEU-4 · 2020-02-19
    7.26
    best: 15.26 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155

Methodology · 9 results

  • Classification on NICE-2
    Accuracy · 2022-11-30
    99.76
    SOTA
    Transformers are Short Text Classifiers: A Study of Inductive Short Text Classifiers on Benchmarks and Real-world Datasets · arXiv:2211.16878
  • Classification on arXiv-10
    Accuracy · 2019-07-26
    0.779
    best: 0.794 (Protoformer)
    SOTA
    RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692
  • Classification on UK Key Stage Readability
    F1 · 2024-11-26
    73.1
    best: 99.6 (ELECTRA + ANN)
    What Differentiates Educational Literature? A Multimodal Fusion Approach of Transformers and Computational Linguistics · arXiv:2411.17593
  • Data Mining on IMDb Movie Reviews
    Accuracy · 2023-08-07
    95.3
    best: 95.6 (ELECTRA)
    Analysis of the Evolution of Advanced Transformer-Based Language Models: Experiments on Opinion Mining · arXiv:2308.03235
  • Data Mining on IMDb Movie Reviews
    F1 · 2023-08-07
    95.3
    best: 95.6 (ELECTRA)
    Analysis of the Evolution of Advanced Transformer-Based Language Models: Experiments on Opinion Mining · arXiv:2308.03235
  • Interpretable Machine Learning on IMDb Movie Reviews
    Accuracy · 2023-08-07
    95.3
    best: 95.6 (ELECTRA)
    Analysis of the Evolution of Advanced Transformer-Based Language Models: Experiments on Opinion Mining · arXiv:2308.03235
  • Interpretable Machine Learning on IMDb Movie Reviews
    F1 · 2023-08-07
    95.3
    best: 95.6 (ELECTRA)
    Analysis of the Evolution of Advanced Transformer-Based Language Models: Experiments on Opinion Mining · arXiv:2308.03235
  • Classification on MR
    Accuracy · 2022-11-30
    89.42
    best: 93.3 (VLAWE)
    Transformers are Short Text Classifiers: A Study of Inductive Short Text Classifiers on Benchmarks and Real-world Datasets · arXiv:2211.16878
  • Classification on Reddit Ideology Database
    F1-score (Weighted)
    78.13
    best: 86.19 (SVM)

Adversarial · 7 results

  • Text Generation on CodeSearchNet
    Smoothed BLEU-4 · 2020-02-19
    14.52
    best: 15.99 (CodeBERT (MLM+RTD))
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Text Generation on CodeSearchNet - Python
    Smoothed BLEU-4 · 2020-02-19
    14.92
    best: 20.39 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Text Generation on CodeSearchNet - Go
    Smoothed BLEU-4 · 2020-02-19
    26.09
    best: 26.79 (CodeBERT (MLM))
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Text Generation on CodeSearchNet - JavaScript
    Smoothed BLEU-4 · 2020-02-19
    5.72
    best: 25.61 (Transformer)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Text Generation on CodeSearchNet - Php
    Smoothed BLEU-4 · 2020-02-19
    19.9
    best: 26.23 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Text Generation on CodeSearchNet - Java
    Smoothed BLEU-4 · 2020-02-19
    13.2
    best: 21.87 (CodeTrans-MT-Large)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155
  • Text Generation on CodeSearchNet - Ruby
    Smoothed BLEU-4 · 2020-02-19
    7.26
    best: 15.26 (CodeTrans-MT-Base)
    CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155

Knowledge Base · 3 results

  • Text Summarization on EDUsum
    ROUGE-1
    63.22
    best: 64.48 (GP_Step_Sim)
  • Text Summarization on EDUsum
    ROUGE-2
    51.34
    best: 52.7 (GP_Step_Sim)
  • Text Summarization on EDUsum
    ROUGE-L
    60.26
    best: 61.91 (GP_Step_Sim)