
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


BERTweet: A pre-trained language model for English Tweets

Dat Quoc Nguyen, Thanh Vu, Anh Tuan Nguyen

Date: 2020-05-20
Venue: EMNLP 2020 11
Tags: Text Classification, Sentiment Analysis, Part-Of-Speech Tagging, Named Entity Recognition (NER), XLM-R, Language Modelling

Abstract

We present BERTweet, the first public large-scale pre-trained language model for English Tweets. Our BERTweet, having the same architecture as BERT-base (Devlin et al., 2019), is trained using the RoBERTa pre-training procedure (Liu et al., 2019). Experiments show that BERTweet outperforms strong baselines RoBERTa-base and XLM-R-base (Conneau et al., 2020), producing better performance results than the previous state-of-the-art models on three Tweet NLP tasks: Part-of-speech tagging, Named-entity recognition and text classification. We release BERTweet under the MIT License to facilitate future research and applications on Tweet data. Our BERTweet is available at https://github.com/VinAIResearch/BERTweet

Results

Task                            Dataset     Metric     Value  Model
Part-Of-Speech Tagging          Ritter      Acc        90.1   BERTweet
Part-Of-Speech Tagging          Tweebank    Acc        95.2   BERTweet
Sentiment Analysis              TweetEval   ALL        67.9   BERTweet
Sentiment Analysis              TweetEval   Emoji      33.4   BERTweet
Sentiment Analysis              TweetEval   Emotion    79.3   BERTweet
Sentiment Analysis              TweetEval   Irony      82.1   BERTweet
Sentiment Analysis              TweetEval   Offensive  79.5   BERTweet
Sentiment Analysis              TweetEval   Sentiment  73.4   BERTweet
Sentiment Analysis              TweetEval   Stance     71.2   BERTweet
Named Entity Recognition (NER)  WNUT 2017   F1         56.5   BERTweet
Named Entity Recognition (NER)  WNUT 2016   F1         52.1   BERTweet

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
AdaptiSent: Context-Aware Adaptive Attention for Multimodal Aspect-Based Sentiment Analysis (2025-07-17)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
Assay2Mol: large language model-based drug design using BioAssay context (2025-07-16)
Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)