Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


Sentiment Analysis on SST-2 Binary classification

Metric: Accuracy (higher is better)
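
For reference, accuracy on a binary task like this one is the share of examples whose predicted label matches the gold label, reported here as a percentage. The sketch below is only an illustration: it assumes labels are encoded as 0 (negative) and 1 (positive), and the function name and toy data are hypothetical, not taken from the leaderboard or from any listed paper.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the percentage of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted label equals the gold label.
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)


# Hypothetical toy data (assumed encoding: 1 = positive, 0 = negative).
toy_predictions = [1, 0, 1, 1]
toy_labels = [1, 0, 0, 1]
toy_score = accuracy(toy_predictions, toy_labels)  # 3 of 4 predictions match
```

Individual papers may evaluate on different splits or with different preprocessing, so scores in the table are not guaranteed to be strictly comparable.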


Results

| # | Model | Accuracy | Extra Data | Paper | Date | Code |
|---|-------|----------|------------|-------|------|------|
| 1 | T5-11B | 97.5 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 2 | MT-DNN-SMART | 97.5 | No | SMART: Robust and Efficient Fine-Tuning for Pre-... | 2019-11-08 | Code |
| 3 | T5-3B | 97.4 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 4 | MUPPET Roberta Large | 97.4 | No | Muppet: Massive Multi-task Representations with ... | 2021-01-26 | Code |
| 5 | ALBERT | 97.1 | No | ALBERT: A Lite BERT for Self-supervised Learning... | 2019-09-26 | Code |
| 6 | StructBERTRoBERTa ensemble | 97.1 | No | StructBERT: Incorporating Language Structures in... | 2019-08-13 | - |
| 7 | XLNet (single model) | 97 | No | XLNet: Generalized Autoregressive Pretraining fo... | 2019-06-19 | Code |
| 8 | ELECTRA | 96.9 | No | ELECTRA: Pre-training Text Encoders as Discrimin... | 2020-03-23 | Code |
| 9 | RoBERTa-large 355M + Entailment as Few-shot Learner | 96.9 | No | Entailment as Few-Shot Learner | 2021-04-29 | Code |
| 10 | XLNet-Large (ensemble) | 96.8 | No | XLNet: Generalized Autoregressive Pretraining fo... | 2019-06-19 | Code |
| 11 | FLOATER-large | 96.7 | No | Learning to Encode Position for Transformer with... | 2020-03-13 | Code |
| 12 | MUPPET Roberta base | 96.7 | No | Muppet: Massive Multi-task Representations with ... | 2021-01-26 | Code |
| 13 | RoBERTa (ensemble) | 96.7 | No | RoBERTa: A Robustly Optimized BERT Pretraining A... | 2019-07-26 | Code |
| 14 | DeBERTa (large) | 96.5 | No | DeBERTa: Decoding-enhanced BERT with Disentangle... | 2020-06-05 | Code |
| 15 | MT-DNN-ensemble | 96.5 | No | Improving Multi-Task Deep Neural Networks via Kn... | 2019-04-20 | Code |
| 16 | RoBERTa-large 355M (MLP quantized vector-wise, fine-tuned) | 96.4 | No | LLM.int8(): 8-bit Matrix Multiplication for Tran... | 2022-08-15 | Code |
| 17 | ASA + RoBERTa | 96.3 | No | Adversarial Self-Attention for Language Understa... | 2022-06-25 | Code |
| 18 | T5-Large 770M | 96.3 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 19 | Snorkel MeTaL(ensemble) | 96.2 | No | Training Complex Models with Multi-Task Weak Sup... | 2018-10-05 | Code |
| 20 | PSQ (Chen et al., 2020) | 96.2 | No | A Statistical Framework for Low-bitwidth Trainin... | 2020-10-27 | Code |
| 21 | Heinsen Routing + RoBERTa-large | 96 | Yes | An Algorithm for Routing Vectors in Sequences | 2022-11-20 | Code |
| 22 | MT-DNN | 95.6 | No | Multi-Task Deep Neural Networks for Natural Lang... | 2019-01-31 | Code |
| 23 | Heinsen Routing + GPT-2 | 95.6 | Yes | An Algorithm for Routing Capsules in All Domains | 2019-11-02 | Code |
| 24 | T5-Base | 95.2 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 25 | ERNIE 2.0 Base | 95 | No | ERNIE 2.0: A Continual Pre-training Framework fo... | 2019-07-29 | Code |
| 26 | RoBERTa+DualCL | 94.91 | No | Dual Contrastive Learning: Text Classification v... | 2022-01-21 | Code |
| 27 | BERT-LARGE | 94.9 | No | BERT: Pre-training of Deep Bidirectional Transfo... | 2018-10-11 | Code |
| 28 | RoBERTa + SubRegWeigh (K-means) | 94.84 | No | SubRegWeigh: Effective and Efficient Annotation ... | 2024-09-10 | Code |
| 29 | SpanBERT | 94.8 | No | SpanBERT: Improving Pre-training by Representing... | 2019-07-24 | Code |
| 30 | gMLP-large | 94.8 | No | Pay Attention to MLPs | 2021-05-17 | Code |
| 31 | Q-BERT (Shen et al., 2020) | 94.8 | No | Q-BERT: Hessian Based Ultra Low Precision Quanti... | 2019-09-12 | - |
| 32 | Q8BERT (Zafrir et al., 2019) | 94.7 | No | Q8BERT: Quantized 8Bit BERT | 2019-10-14 | Code |
| 33 | CNN Large | 94.6 | No | Cloze-driven Pretraining of Self-attention Netwo... | 2019-03-19 | - |
| 34 | BigBird | 94.6 | No | Big Bird: Transformers for Longer Sequences | 2020-07-28 | Code |
| 35 | MLM+ del-word+ reorder | 94.5 | No | CLEAR: Contrastive Learning for Sentence Represe... | 2020-12-31 | - |
| 36 | ASA + BERT-base | 94.1 | No | Adversarial Self-Attention for Language Understa... | 2022-06-25 | Code |
| 37 | RealFormer | 94.04 | No | RealFormer: Transformer Likes Residual Attention | 2020-12-21 | Code |
| 38 | FNet-Large | 94 | No | FNet: Mixing Tokens with Fourier Transforms | 2021-05-09 | Code |
| 39 | MT-DNN | 93.6 | No | SMART: Robust and Efficient Fine-Tuning for Pre-... | 2019-11-08 | Code |
| 40 | ERNIE | 93.5 | No | ERNIE: Enhanced Language Representation with Inf... | 2019-05-17 | Code |
| 41 | Block-sparse LSTM | 93.2 | No | - | - | Code |
| 42 | LM-CPPF RoBERTa-base | 93.2 | No | LM-CPPF: Paraphrasing-Guided Data Augmentation f... | 2023-05-29 | Code |
| 43 | TinyBERT-6 67M | 93.1 | No | TinyBERT: Distilling BERT for Natural Language U... | 2019-09-23 | Code |
| 44 | 24hBERT | 93 | No | How to Train BERT with an Academic Budget | 2021-04-15 | Code |
| 45 | SMART+BERT-BASE | 93 | No | SMART: Robust and Efficient Fine-Tuning for Pre-... | 2019-11-08 | Code |
| 46 | TinyBERT-4 14.5M | 92.6 | No | TinyBERT: Distilling BERT for Natural Language U... | 2019-09-23 | Code |
| 47 | bmLSTM | 91.8 | No | Learning to Generate Reviews and Discovering Sen... | 2017-04-05 | Code |
| 48 | T5-Small | 91.8 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 49 | byte mLSTM7 | 91.7 | No | A La Carte Embedding: Cheap but Effective Induct... | 2018-05-14 | Code |
| 50 | PAR BERT Base | 91.6 | No | Pay Attention when Required | 2020-09-09 | Code |
| 51 | Charformer-Base | 91.6 | No | Charformer: Fast Character Transformers via Grad... | 2021-06-23 | Code |
| 52 | SqueezeBERT | 91.4 | No | SqueezeBERT: What can computer vision teach NLP ... | 2020-06-19 | Code |
| 53 | Nyströmformer | 91.4 | No | Nyströmformer: A Nyström-Based Algorithm for App... | 2021-02-07 | Code |
| 54 | Bi-CAS-LSTM | 91.3 | No | Cell-aware Stacked LSTMs for Modeling Sentences | 2018-09-07 | - |
| 55 | DistilBERT 66M | 91.3 | No | DistilBERT, a distilled version of BERT: smaller... | 2019-10-02 | Code |
| 56 | CNN | 91.2 | No | On the Role of Text Preprocessing in Neural Netw... | 2017-07-06 | Code |
| 57 | Suffix BiLSTM | 91.2 | No | Improved Sentence Modeling using Suffix Bidirect... | 2018-05-18 | - |
| 58 | BERT Base | 91.2 | No | Fine-grained Sentiment Classification using BERT | 2019-10-04 | Code |
| 59 | Transformer (finetune) | 90.9 | No | Practical Text Classification With Large Pre-Tra... | 2018-12-04 | Code |
| 60 | Single layer bilstm distilled from BERT | 90.7 | No | Distilling Task-Specific Knowledge from BERT int... | 2019-03-28 | Code |
| 61 | BCN+Char+CoVe | 90.3 | No | Learned in Translation: Contextualized Word Vect... | 2017-08-01 | Code |
| 62 | CNN-RNF-LSTM | 90 | No | Convolutional Neural Networks with Recurrent Neu... | 2018-08-28 | Code |
| 63 | Neural Semantic Encoder | 89.7 | No | Neural Semantic Encoders | 2016-07-14 | Code |
| 64 | BLSTM-2DCNN | 89.5 | No | Text Classification Improved by Integrating Bidi... | 2016-11-21 | Code |
| 65 | CNN + Logic rules | 89.3 | No | Harnessing Deep Neural Networks with Logic Rules | 2016-03-21 | Code |
| 66 | DMN [ankit16] | 88.6 | No | Ask Me Anything: Dynamic Memory Networks for Nat... | 2015-06-24 | Code |
| 67 | CNN-multichannel [kim2013] | 88.1 | No | Convolutional Neural Networks for Sentence Class... | 2014-08-25 | Code |
| 68 | Consistency Tree LSTM with tuned Glove vectors [tai2015improved] | 88 | No | Improved Semantic Representations From Tree-Stru... | 2015-02-28 | Code |
| 69 | C-LSTM | 87.8 | No | A C-LSTM Neural Network for Text Classification | 2015-11-27 | Code |
| 70 | MPAD-path | 87.75 | No | Message Passing Attention Networks for Document ... | 2019-08-17 | Code |
| 71 | Standard DR-AGG | 87.6 | No | Information Aggregation via Dynamic Routing for ... | 2018-06-05 | Code |
| 72 | USE_T+CNN (lrn w.e.) | 87.21 | No | Universal Sentence Encoder | 2018-03-29 | Code |
| 73 | Reverse DR-AGG | 87.2 | No | Information Aggregation via Dynamic Routing for ... | 2018-06-05 | Code |
| 74 | DC-MCNN | 86.99 | No | - | - | - |
| 75 | STM+TSED+PT+2L | 86.95 | No | The Pupil Has Become the Master: Teacher-Student... | 2019-05-31 | Code |
| 76 | Capsule-B | 86.8 | No | Investigating Capsule Networks with Dynamic Rout... | 2018-03-29 | Code |
| 77 | 2-layer LSTM [tai2015improved] | 86.3 | No | Improved Semantic Representations From Tree-Stru... | 2015-02-28 | Code |
| 78 | SWEM-concat | 84.3 | No | Baseline Needs More Love: On Simple Word-Embeddi... | 2018-05-24 | Code |
| 79 | MV-RNN | 82.9 | No | - | - | Code |
| 80 | GloVe+Emo2Vec | 82.3 | No | Emo2Vec: Learning Generalized Emotion Representa... | 2018-09-12 | Code |
| 81 | Emo2Vec | 81.2 | No | Emo2Vec: Learning Generalized Emotion Representa... | 2018-09-12 | Code |
| 82 | ToWE-CBOW | 78.8 | No | - | - | Code |
| 83 | Joined Model Multi-tasking | 54.72 | No | - | - | - |