Natural Language Inference on SNLI
Metric: % Train Accuracy (higher is better)
Results
| # | Model | % Train Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | + Unigram and bigram features | 99.7 | No | A large annotated corpus for learning natural language inference | 2015-08-21 | Code |
| 2 | Ntumpha | 99.1 | No | Multi-Task Deep Neural Networks for Natural Language Understanding | 2019-01-31 | Code |
| 3 | 1024D GRU encoders w/ unsupervised 'skip-thoughts' pre-training | 98.8 | No | Order-Embeddings of Images and Language | 2015-11-19 | Code |
| 4 | MT-DNN | 97.2 | No | Multi-Task Deep Neural Networks for Natural Language Understanding | 2019-01-31 | Code |
| 5 | Fine-Tuned LM-Pretrained Transformer | 96.6 | No | - | - | Code |
| 6 | 300D DMAN Ensemble | 96.1 | No | Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference | 2019-07-23 | Code |
| 7 | 300D DMAN Ensemble | 96.1 | No | Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference | 2019-07-23 | Code |
| 8 | SJRC (BERT-Large +SRL) | 95.7 | No | Explicit Contextual Semantics for Text Comprehension | 2018-09-08 | - |
| 9 | 150D Multiway Attention Network Ensemble | 95.5 | No | - | - | Code |
| 10 | 300D DMAN | 95.4 | No | Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference | 2019-07-23 | Code |
| 11 | 300D DMAN | 95.4 | No | Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference | 2019-07-23 | Code |
| 12 | Densely-Connected Recurrent and Co-Attentive Network Ensemble | 95 | No | Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information | 2018-05-29 | - |
| 13 | 600D BiLSTM with generalized pooling | 94.9 | No | Enhancing Sentence Embedding with Generalized Pooling | 2018-06-26 | Code |
| 14 | 450D DR-BiLSTM Ensemble | 94.8 | No | DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference | 2018-02-15 | - |
| 15 | 150D Multiway Attention Network | 94.5 | No | - | - | Code |
| 16 | SemBERT | 94.4 | No | Semantics-aware BERT for Language Understanding | 2019-09-05 | Code |
| 17 | KIM | 94.1 | No | Neural Natural Language Inference Models Enhanced with External Knowledge | 2017-11-12 | Code |
| 18 | 450D DR-BiLSTM | 94.1 | No | DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference | 2018-02-15 | - |
| 19 | RE2 | 94 | No | Simple and Effective Text Matching with Richer Alignment Features | 2019-08-01 | Code |
| 20 | KIM Ensemble | 93.6 | No | Neural Natural Language Inference Models Enhanced with External Knowledge | 2017-11-12 | Code |
| 21 | 600D ESIM + 300D Syntactic TreeLSTM | 93.5 | No | Enhanced LSTM for Natural Language Inference | 2016-09-20 | Code |
| 22 | Stochastic Answer Network | 93.3 | No | Stochastic Answer Networks for Natural Language Inference | 2018-04-21 | Code |
| 23 | BiMPM Ensemble | 93.2 | No | Bilateral Multi-Perspective Matching for Natural Language Sentences | 2017-02-13 | Code |
| 24 | MFAE | 93.18 | No | - | - | Code |
| 25 | Densely-Connected Recurrent and Co-Attentive Network | 93.1 | No | Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information | 2018-05-29 | - |
| 26 | 600D Gumbel TreeLSTM encoders | 93.1 | No | Learning to Compose Task-Specific Tree Structures | 2017-07-10 | Code |
| 27 | CA-MTL | 92.6 | No | Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data | 2020-09-19 | Code |
| 28 | DEIM | 92.6 | No | DEIM: An effective deep encoding and interaction model for sentence matching | 2022-03-20 | - |
| 29 | 300D Reinforced Self-Attention Network | 92.6 | No | Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling | 2018-01-31 | Code |
| 30 | 300D CAFE Ensemble | 92.5 | No | Compare, Compress and Propagate: Enhancing Neural Architectures with Alignment Factorization for Natural Language Inference | 2017-12-30 | - |
| 31 | 448D Densely Interactive Inference Network (DIIN, code) Ensemble | 92.3 | No | Natural Language Inference over Interaction Space | 2017-09-13 | Code |
| 32 | ESIM + ELMo Ensemble | 92.1 | No | Deep contextualized word representations | 2018-02-15 | Code |
| 33 | 300D mLSTM word-by-word attention model | 92 | No | Learning Natural Language Inference with LSTM | 2015-12-30 | Code |
| 34 | ESIM + ELMo | 91.6 | No | Deep contextualized word representations | 2018-02-15 | Code |
| 35 | 512D Dynamic Meta-Embeddings | 91.6 | No | Dynamic Meta-Embeddings for Improved Sentence Representations | 2018-04-21 | Code |
| 36 | Densely-Connected Recurrent and Co-Attentive Network (encoder) | 91.4 | No | Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information | 2018-05-29 | - |
| 37 | 448D Densely Interactive Inference Network (DIIN, code) | 91.2 | No | Natural Language Inference over Interaction Space | 2017-09-13 | Code |
| 38 | 300D Gumbel TreeLSTM encoders | 91.2 | No | Learning to Compose Task-Specific Tree Structures | 2017-07-10 | Code |
| 39 | 300D Directional self-attention network encoders | 91.1 | No | DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding | 2017-09-14 | Code |
| 40 | 600D Residual stacked encoders | 91 | No | Shortcut-Stacked Sentence Encoders for Multi-Domain Inference | 2017-08-07 | Code |
| 41 | BiMPM | 90.9 | No | Bilateral Multi-Perspective Matching for Natural Language Sentences | 2017-02-13 | Code |
| 42 | 300D re-read LSTM | 90.7 | No | - | - | - |
| 43 | 300D re-read LSTM | 90.7 | No | - | - | - |
| 44 | 200D decomposable attention feed-forward model with intra-sentence attention | 90.5 | No | A Decomposable Attention Model for Natural Language Inference | 2016-06-06 | Code |
| 45 | 200D decomposable attention model with intra-sentence attention | 90.5 | No | A Decomposable Attention Model for Natural Language Inference | 2016-06-06 | Code |
| 46 | 600D (300+300) Deep Gated Attn. BiLSTM encoders | 90.5 | No | Recurrent Neural Network-Based Sentence Encoder with Gated Attention for Natural Language Inference | 2017-08-04 | Code |
| 47 | 600D Hierarchical BiLSTM with Max Pooling (HBMP, code) | 89.9 | No | Sentence Embeddings in NLI with Iterative Refinement Encoders | 2018-08-27 | Code |
| 48 | 300D CAFE | 89.8 | No | Compare, Compress and Propagate: Enhancing Neural Architectures with Alignment Factorization for Natural Language Inference | 2017-12-30 | - |
| 49 | 300D Residual stacked encoders | 89.8 | No | Shortcut-Stacked Sentence Encoders for Multi-Domain Inference | 2017-08-07 | Code |
| 50 | Distance-based Self-Attention Network | 89.6 | No | Distance-based Self-Attention Network for Natural Language Inference | 2017-12-06 | - |
| 51 | 200D decomposable attention feed-forward model | 89.5 | No | A Decomposable Attention Model for Natural Language Inference | 2016-06-06 | Code |
| 52 | 200D decomposable attention model | 89.5 | No | A Decomposable Attention Model for Natural Language Inference | 2016-06-06 | Code |
| 53 | 300D SPINN-PI encoders | 89.2 | No | A Fast Unified Model for Parsing and Sentence Understanding | 2016-03-19 | Code |
| 54 | SLRC | 89.1 | No | Explicit Contextual Semantics for Text Comprehension | 2018-09-08 | - |
| 55 | 2400D Multiple-Dynamic Self-Attention Model | 89 | No | Dynamic Self-Attention : Computing Attention over Words Dynamically for Sentence Embedding | 2018-08-22 | Code |
| 56 | Biattentive Classification Network + CoVe + Char | 88.5 | No | Learned in Translation: Contextualized Word Vectors | 2017-08-01 | Code |
| 57 | 300D Full tree matching NTI-SLSTM-LSTM w/ global attention | 88.5 | No | Neural Tree Indexers for Text Understanding | 2016-07-15 | Code |
| 58 | 450D LSTMN with deep attention fusion | 88.5 | No | Long Short-Term Memory-Networks for Machine Reading | 2016-01-25 | Code |
| 59 | 600D Dynamic Self-Attention Model | 87.3 | No | Dynamic Self-Attention : Computing Attention over Words Dynamically for Sentence Embedding | 2018-08-22 | Code |
| 60 | 300D CAFE (no cross-sentence attention) | 87.3 | No | Compare, Compress and Propagate: Enhancing Neural Architectures with Alignment Factorization for Natural Language Inference | 2017-12-30 | - |
| 61 | 300D LSTMN with deep attention fusion | 87.3 | No | Long Short-Term Memory-Networks for Machine Reading | 2016-01-25 | Code |
| 62 | 300D MMA-NSE encoders with attention | 86.9 | No | Neural Semantic Encoders | 2016-07-14 | Code |
| 63 | 50D stacked TC-LSTMs | 86.7 | No | Modelling Interaction of Sentence Pair with coupled-LSTMs | 2016-05-18 | - |
| 64 | 600D (300+300) BiLSTM encoders | 86.4 | No | Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention | 2016-05-30 | Code |
| 65 | 300D NSE encoders | 86.2 | No | Neural Semantic Encoders | 2016-07-14 | Code |
| 66 | 600D (300+300) BiLSTM encoders with intra-attention and symbolic preproc. | 85.9 | No | Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention | 2016-05-30 | Code |
| 67 | 4096D BiLSTM with max-pooling | 85.6 | No | Supervised Learning of Universal Sentence Representations from Natural Language Inference Data | 2017-05-05 | Code |
| 68 | 100D LSTMs w/ word-by-word attention | 85.3 | No | Reasoning about Entailment with Neural Attention | 2015-09-22 | Code |
| 69 | 100D DF-LSTM | 85.2 | No | - | - | - |
| 70 | 100D LSTM encoders | 84.8 | No | A large annotated corpus for learning natural language inference | 2015-08-21 | Code |
| 71 | 600D (300+300) BiLSTM encoders with intra-attention | 84.5 | No | Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention | 2016-05-30 | Code |
| 72 | 300D LSTM encoders | 83.9 | No | A Fast Unified Model for Parsing and Sentence Understanding | 2016-03-19 | Code |
| 73 | 300D Tree-based CNN encoders | 83.3 | No | Natural Language Inference by Tree-Based Convolution and Heuristic Matching | 2015-12-28 | - |
| 74 | 300D NTI-SLSTM-LSTM encoders | 82.5 | No | Neural Tree Indexers for Text Understanding | 2016-07-15 | Code |
| 75 | Unlexicalized features | 49.4 | No | A large annotated corpus for learning natural language inference | 2015-08-21 | Code |