Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Natural Language Inference on SNLI

Metric: % Test Accuracy (higher is better)

Results

| # | Model | % Test Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | UnitedSynT5 (3B) | 94.7 | Yes | First Train to Generate, then Generate to Train:... | 2024-12-12 | - |
| 2 | UnitedSynT5 (335M) | 93.5 | Yes | First Train to Generate, then Generate to Train:... | 2024-12-12 | - |
| 3 | Neural Tree Indexers for Text Understanding | 93.1 | No | Entailment as Few-Shot Learner | 2021-04-29 | Code |
| 4 | EFL (Entailment as Few-shot Learner) + RoBERTa-large | 93.1 | No | Entailment as Few-Shot Learner | 2021-04-29 | Code |
| 5 | RoBERTa-large+Self-Explaining | 92.3 | No | Self-Explaining Structures Improve NLP Models | 2020-12-03 | Code |
| 6 | RoBERTa-large + self-explaining layer | 92.3 | No | Self-Explaining Structures Improve NLP Models | 2020-12-03 | Code |
| 7 | CA-MTL | 92.1 | No | Conditionally Adaptive Multi-Task Learning: Impr... | 2020-09-19 | Code |
| 8 | SemBERT | 91.9 | No | Semantics-aware BERT for Language Understanding | 2019-09-05 | Code |
| 9 | MT-DNN-SMARTLARGEv0 | 91.7 | No | SMART: Robust and Efficient Fine-Tuning for Pre-... | 2019-11-08 | Code |
| 10 | MT-DNN | 91.6 | No | Multi-Task Deep Neural Networks for Natural Lang... | 2019-01-31 | Code |
| 11 | SJRC (BERT-Large +SRL) | 91.3 | No | Explicit Contextual Semantics for Text Comprehen... | 2018-09-08 | - |
| 12 | Ntumpha | 90.5 | No | Multi-Task Deep Neural Networks for Natural Lang... | 2019-01-31 | Code |
| 13 | Densely-Connected Recurrent and Co-Attentive Network Ensemble | 90.1 | No | Semantic Sentence Matching with Densely-connecte... | 2018-05-29 | - |
| 14 | MFAE | 90.07 | No | - | - | Code |
| 15 | Fine-Tuned LM-Pretrained Transformer | 89.9 | No | - | - | Code |
| 16 | 300D DMAN Ensemble | 89.6 | No | Discourse Marker Augmented Network with Reinforc... | 2019-07-23 | Code |
| 17 | 300D DMAN Ensemble | 89.6 | No | Discourse Marker Augmented Network with Reinforc... | 2019-07-23 | Code |
| 18 | 150D Multiway Attention Network Ensemble | 89.4 | No | - | - | Code |
| 19 | 450D DR-BiLSTM Ensemble | 89.3 | No | DR-BiLSTM: Dependent Reading Bidirectional LSTM ... | 2018-02-15 | - |
| 20 | 300D CAFE Ensemble | 89.3 | No | Compare, Compress and Propagate: Enhancing Neura... | 2017-12-30 | - |
| 21 | ESIM + ELMo Ensemble | 89.3 | No | Deep contextualized word representations | 2018-02-15 | Code |
| 22 | KIM Ensemble | 89.1 | No | Neural Natural Language Inference Models Enhance... | 2017-11-12 | Code |
| 23 | SLRC | 89.1 | No | Explicit Contextual Semantics for Text Comprehen... | 2018-09-08 | - |
| 24 | RE2 | 88.9 | No | Simple and Effective Text Matching with Richer A... | 2019-08-01 | Code |
| 25 | Densely-Connected Recurrent and Co-Attentive Network | 88.9 | No | Semantic Sentence Matching with Densely-connecte... | 2018-05-29 | - |
| 26 | DEIM | 88.9 | No | DEIM: An effective deep encoding and interaction... | 2022-03-20 | - |
| 27 | 448D Densely Interactive Inference Network (DIIN, code) Ensemble | 88.9 | No | Natural Language Inference over Interaction Space | 2017-09-13 | Code |
| 28 | 300D DMAN | 88.8 | No | Discourse Marker Augmented Network with Reinforc... | 2019-07-23 | Code |
| 29 | 300D DMAN | 88.8 | No | Discourse Marker Augmented Network with Reinforc... | 2019-07-23 | Code |
| 30 | BiMPM Ensemble | 88.8 | No | Bilateral Multi-Perspective Matching for Natural... | 2017-02-13 | Code |
| 31 | ESIM + ELMo | 88.7 | No | Deep contextualized word representations | 2018-02-15 | Code |
| 32 | KIM | 88.6 | No | Neural Natural Language Inference Models Enhance... | 2017-11-12 | Code |
| 33 | 600D ESIM + 300D Syntactic TreeLSTM | 88.6 | No | Enhanced LSTM for Natural Language Inference | 2016-09-20 | Code |
| 34 | 450D DR-BiLSTM | 88.5 | No | DR-BiLSTM: Dependent Reading Bidirectional LSTM ... | 2018-02-15 | - |
| 35 | Stochastic Answer Network | 88.5 | No | Stochastic Answer Networks for Natural Language ... | 2018-04-21 | Code |
| 36 | 300D CAFE | 88.5 | No | Compare, Compress and Propagate: Enhancing Neura... | 2017-12-30 | - |
| 37 | 150D Multiway Attention Network | 88.3 | No | - | - | Code |
| 38 | Biattentive Classification Network + CoVe + Char | 88.1 | No | Learned in Translation: Contextualized Word Vect... | 2017-08-01 | Code |
| 39 | aESIM | 88.1 | No | Attention Boosted Sequential Inference Model | 2018-12-05 | - |
| 40 | 448D Densely Interactive Inference Network (DIIN, code) | 88 | No | Natural Language Inference over Interaction Space | 2017-09-13 | Code |
| 41 | Enhanced Sequential Inference Model (Chen et al., [2017a]) | 88 | No | Enhanced LSTM for Natural Language Inference | 2016-09-20 | Code |
| 42 | BiMPM | 87.5 | No | Bilateral Multi-Perspective Matching for Natural... | 2017-02-13 | Code |
| 43 | 300D re-read LSTM | 87.5 | No | - | - | - |
| 44 | 300D re-read LSTM | 87.5 | No | - | - | - |
| 45 | 2400D Multiple-Dynamic Self-Attention Model | 87.4 | No | Dynamic Self-Attention : Computing Attention ove... | 2018-08-22 | Code |
| 46 | 300D Full tree matching NTI-SLSTM-LSTM w/ global attention | 87.3 | No | Neural Tree Indexers for Text Understanding | 2016-07-15 | Code |
| 47 | 300D 2-layer Bi-CAS-LSTM | 87 | No | Cell-aware Stacked LSTMs for Modeling Sentences | 2018-09-07 | - |
| 48 | 200D decomposable attention feed-forward model with intra-sentence attention | 86.8 | No | A Decomposable Attention Model for Natural Langu... | 2016-06-06 | Code |
| 49 | 200D decomposable attention model with intra-sentence attention | 86.8 | No | A Decomposable Attention Model for Natural Langu... | 2016-06-06 | Code |
| 50 | 600D Dynamic Self-Attention Model | 86.8 | No | Dynamic Self-Attention : Computing Attention ove... | 2018-08-22 | Code |
| 51 | CBS-1 + ESIM | 86.73 | No | Parameter Re-Initialization through Cyclical Bat... | 2018-12-04 | - |
| 52 | 512D Dynamic Meta-Embeddings | 86.7 | No | Dynamic Meta-Embeddings for Improved Sentence Re... | 2018-04-21 | Code |
| 53 | 600D BiLSTM with generalized pooling | 86.6 | No | Enhancing Sentence Embedding with Generalized Po... | 2018-06-26 | Code |
| 54 | 600D Hierarchical BiLSTM with Max Pooling (HBMP, code) | 86.6 | No | Sentence Embeddings in NLI with Iterative Refine... | 2018-08-27 | Code |
| 55 | Densely-Connected Recurrent and Co-Attentive Network (encoder) | 86.5 | No | Semantic Sentence Matching with Densely-connecte... | 2018-05-29 | - |
| 56 | 300D Reinforced Self-Attention Network | 86.3 | No | Reinforced Self-Attention Network: a Hybrid of H... | 2018-01-31 | Code |
| 57 | Distance-based Self-Attention Network | 86.3 | No | Distance-based Self-Attention Network for Natura... | 2017-12-06 | - |
| 58 | 200D decomposable attention feed-forward model | 86.3 | No | A Decomposable Attention Model for Natural Langu... | 2016-06-06 | Code |
| 59 | 200D decomposable attention model | 86.3 | No | A Decomposable Attention Model for Natural Langu... | 2016-06-06 | Code |
| 60 | 450D LSTMN with deep attention fusion | 86.3 | No | Long Short-Term Memory-Networks for Machine Read... | 2016-01-25 | Code |
| 61 | 300D mLSTM word-by-word attention model | 86.1 | No | Learning Natural Language Inference with LSTM | 2015-12-30 | Code |
| 62 | 600D Gumbel TreeLSTM encoders | 86 | No | Learning to Compose Task-Specific Tree Structures | 2017-07-10 | Code |
| 63 | 600D Residual stacked encoders | 86 | No | Shortcut-Stacked Sentence Encoders for Multi-Dom... | 2017-08-07 | Code |
| 64 | Star-Transformer (no cross sentence attention) | 86 | No | Star-Transformer | 2019-02-25 | Code |
| 65 | 300D CAFE (no cross-sentence attention) | 85.9 | No | Compare, Compress and Propagate: Enhancing Neura... | 2017-12-30 | - |
| 66 | 1200D REGMAPR (Base+Reg) | 85.9 | No | - | - | - |
| 67 | 300D Residual stacked encoders | 85.7 | No | Shortcut-Stacked Sentence Encoders for Multi-Dom... | 2017-08-07 | Code |
| 68 | 300D LSTMN with deep attention fusion | 85.7 | No | Long Short-Term Memory-Networks for Machine Read... | 2016-01-25 | Code |
| 69 | 300D Gumbel TreeLSTM encoders | 85.6 | No | Learning to Compose Task-Specific Tree Structures | 2017-07-10 | Code |
| 70 | 300D Directional self-attention network encoders | 85.6 | No | DiSAN: Directional Self-Attention Network for RN... | 2017-09-14 | Code |
| 71 | 600D (300+300) Deep Gated Attn. BiLSTM encoders | 85.5 | No | Recurrent Neural Network-Based Sentence Encoder ... | 2017-08-04 | Code |
| 72 | 300D MMA-NSE encoders with attention | 85.4 | No | Neural Semantic Encoders | 2016-07-14 | Code |
| 73 | 50D stacked TC-LSTMs | 85.1 | No | Modelling Interaction of Sentence Pair with coup... | 2016-05-18 | - |
| 74 | 600D (300+300) BiLSTM encoders with intra-attention and symbolic preproc. | 85 | No | Learning Natural Language Inference using Bidire... | 2016-05-30 | Code |
| 75 | Stacked Bi-LSTMs (shortcut connections, max-pooling) | 84.8 | No | Combining Similarity Features and Deep Represent... | 2018-11-02 | Code |
| 76 | 300D NSE encoders | 84.6 | No | Neural Semantic Encoders | 2016-07-14 | Code |
| 77 | 100D DF-LSTM | 84.6 | No | - | - | - |
| 78 | 4096D BiLSTM with max-pooling | 84.5 | No | Supervised Learning of Universal Sentence Repres... | 2017-05-05 | Code |
| 79 | Bi-LSTM sentence encoder (max-pooling) | 84.5 | No | Combining Similarity Features and Deep Represent... | 2018-11-02 | Code |
| 80 | Stacked Bi-LSTMs (shortcut connections, max-pooling, attention) | 84.4 | No | Combining Similarity Features and Deep Represent... | 2018-11-02 | Code |
| 81 | 600D (300+300) BiLSTM encoders with intra-attention | 84.2 | No | Learning Natural Language Inference using Bidire... | 2016-05-30 | Code |
| 82 | SWEM-max | 83.8 | No | Baseline Needs More Love: On Simple Word-Embeddi... | 2018-05-24 | Code |
| 83 | 100D LSTMs w/ word-by-word attention | 83.5 | No | Reasoning about Entailment with Neural Attention | 2015-09-22 | Code |
| 84 | 300D NTI-SLSTM-LSTM encoders | 83.4 | No | Neural Tree Indexers for Text Understanding | 2016-07-15 | Code |
| 85 | 600D (300+300) BiLSTM encoders | 83.3 | No | Learning Natural Language Inference using Bidire... | 2016-05-30 | Code |
| 86 | 300D SPINN-PI encoders | 83.2 | No | A Fast Unified Model for Parsing and Sentence Un... | 2016-03-19 | Code |
| 87 | 300D Tree-based CNN encoders | 82.1 | No | Natural Language Inference by Tree-Based Convolu... | 2015-12-28 | - |
| 88 | 1024D GRU encoders w/ unsupervised 'skip-thoughts' pre-training | 81.4 | No | Order-Embeddings of Images and Language | 2015-11-19 | Code |
| 89 | DELTA (LSTM) | 80.7 | No | DELTA: A DEep learning based Language Technology... | 2019-08-02 | Code |
| 90 | 300D LSTM encoders | 80.6 | No | A Fast Unified Model for Parsing and Sentence Un... | 2016-03-19 | Code |
| 91 | + Unigram and bigram features | 78.2 | No | A large annotated corpus for learning natural la... | 2015-08-21 | Code |
| 92 | 100D LSTM encoders | 77.6 | No | A large annotated corpus for learning natural la... | 2015-08-21 | Code |
| 93 | Unlexicalized features | 50.4 | No | A large annotated corpus for learning natural la... | 2015-08-21 | Code |