Metric: % Train Accuracy (higher is better)
| # | Model | % Train Accuracy | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | + Unigram and bigram features | 99.7 | No | A large annotated corpus for learning natural la... | 2015-08-21 | Code |
| 2 | Ntumpha | 99.1 | No | Multi-Task Deep Neural Networks for Natural Lang... | 2019-01-31 | Code |
| 3 | 1024D GRU encoders w/ unsupervised 'skip-thoughts' pre-training | 98.8 | No | Order-Embeddings of Images and Language | 2015-11-19 | Code |
| 4 | MT-DNN | 97.2 | No | Multi-Task Deep Neural Networks for Natural Lang... | 2019-01-31 | Code |
| 5 | Fine-Tuned LM-Pretrained Transformer | 96.6 | No | - | - | Code |
| 6 | 300D DMAN Ensemble | 96.1 | No | Discourse Marker Augmented Network with Reinforc... | 2019-07-23 | Code |
| 7 | 300D DMAN Ensemble | 96.1 | No | Discourse Marker Augmented Network with Reinforc... | 2019-07-23 | Code |
| 8 | SJRC (BERT-Large +SRL) | 95.7 | No | Explicit Contextual Semantics for Text Comprehen... | 2018-09-08 | - |
| 9 | 150D Multiway Attention Network Ensemble | 95.5 | No | - | - | Code |
| 10 | 300D DMAN | 95.4 | No | Discourse Marker Augmented Network with Reinforc... | 2019-07-23 | Code |
| 11 | 300D DMAN | 95.4 | No | Discourse Marker Augmented Network with Reinforc... | 2019-07-23 | Code |
| 12 | Densely-Connected Recurrent and Co-Attentive Network Ensemble | 95.0 | No | Semantic Sentence Matching with Densely-connecte... | 2018-05-29 | - |
| 13 | 600D BiLSTM with generalized pooling | 94.9 | No | Enhancing Sentence Embedding with Generalized Po... | 2018-06-26 | Code |
| 14 | 450D DR-BiLSTM Ensemble | 94.8 | No | DR-BiLSTM: Dependent Reading Bidirectional LSTM ... | 2018-02-15 | - |
| 15 | 150D Multiway Attention Network | 94.5 | No | - | - | Code |
| 16 | SemBERT | 94.4 | No | Semantics-aware BERT for Language Understanding | 2019-09-05 | Code |
| 17 | KIM | 94.1 | No | Neural Natural Language Inference Models Enhance... | 2017-11-12 | Code |
| 18 | 450D DR-BiLSTM | 94.1 | No | DR-BiLSTM: Dependent Reading Bidirectional LSTM ... | 2018-02-15 | - |
| 19 | RE2 | 94.0 | No | Simple and Effective Text Matching with Richer A... | 2019-08-01 | Code |
| 20 | KIM Ensemble | 93.6 | No | Neural Natural Language Inference Models Enhance... | 2017-11-12 | Code |
| 21 | 600D ESIM + 300D Syntactic TreeLSTM | 93.5 | No | Enhanced LSTM for Natural Language Inference | 2016-09-20 | Code |
| 22 | Stochastic Answer Network | 93.3 | No | Stochastic Answer Networks for Natural Language ... | 2018-04-21 | Code |
| 23 | BiMPM Ensemble | 93.2 | No | Bilateral Multi-Perspective Matching for Natural... | 2017-02-13 | Code |
| 24 | MFAE | 93.18 | No | - | - | Code |
| 25 | Densely-Connected Recurrent and Co-Attentive Network | 93.1 | No | Semantic Sentence Matching with Densely-connecte... | 2018-05-29 | - |
| 26 | 600D Gumbel TreeLSTM encoders | 93.1 | No | Learning to Compose Task-Specific Tree Structures | 2017-07-10 | Code |
| 27 | CA-MTL | 92.6 | No | Conditionally Adaptive Multi-Task Learning: Impr... | 2020-09-19 | Code |
| 28 | DEIM | 92.6 | No | DEIM: An effective deep encoding and interaction... | 2022-03-20 | - |
| 29 | 300D Reinforced Self-Attention Network | 92.6 | No | Reinforced Self-Attention Network: a Hybrid of H... | 2018-01-31 | Code |
| 30 | 300D CAFE Ensemble | 92.5 | No | Compare, Compress and Propagate: Enhancing Neura... | 2017-12-30 | - |
| 31 | 448D Densely Interactive Inference Network (DIIN) Ensemble | 92.3 | No | Natural Language Inference over Interaction Space | 2017-09-13 | Code |
| 32 | ESIM + ELMo Ensemble | 92.1 | No | Deep contextualized word representations | 2018-02-15 | Code |
| 33 | 300D mLSTM word-by-word attention model | 92.0 | No | Learning Natural Language Inference with LSTM | 2015-12-30 | Code |
| 34 | ESIM + ELMo | 91.6 | No | Deep contextualized word representations | 2018-02-15 | Code |
| 35 | 512D Dynamic Meta-Embeddings | 91.6 | No | Dynamic Meta-Embeddings for Improved Sentence Re... | 2018-04-21 | Code |
| 36 | Densely-Connected Recurrent and Co-Attentive Network (encoder) | 91.4 | No | Semantic Sentence Matching with Densely-connecte... | 2018-05-29 | - |
| 37 | 448D Densely Interactive Inference Network (DIIN) | 91.2 | No | Natural Language Inference over Interaction Space | 2017-09-13 | Code |
| 38 | 300D Gumbel TreeLSTM encoders | 91.2 | No | Learning to Compose Task-Specific Tree Structures | 2017-07-10 | Code |
| 39 | 300D Directional self-attention network encoders | 91.1 | No | DiSAN: Directional Self-Attention Network for RN... | 2017-09-14 | Code |
| 40 | 600D Residual stacked encoders | 91.0 | No | Shortcut-Stacked Sentence Encoders for Multi-Dom... | 2017-08-07 | Code |
| 41 | BiMPM | 90.9 | No | Bilateral Multi-Perspective Matching for Natural... | 2017-02-13 | Code |
| 42 | 300D re-read LSTM | 90.7 | No | - | - | - |
| 43 | 300D re-read LSTM | 90.7 | No | - | - | - |
| 44 | 200D decomposable attention feed-forward model with intra-sentence attention | 90.5 | No | A Decomposable Attention Model for Natural Langu... | 2016-06-06 | Code |
| 45 | 200D decomposable attention model with intra-sentence attention | 90.5 | No | A Decomposable Attention Model for Natural Langu... | 2016-06-06 | Code |
| 46 | 600D (300+300) Deep Gated Attn. BiLSTM encoders | 90.5 | No | Recurrent Neural Network-Based Sentence Encoder ... | 2017-08-04 | Code |
| 47 | 600D Hierarchical BiLSTM with Max Pooling (HBMP) | 89.9 | No | Sentence Embeddings in NLI with Iterative Refine... | 2018-08-27 | Code |
| 48 | 300D CAFE | 89.8 | No | Compare, Compress and Propagate: Enhancing Neura... | 2017-12-30 | - |
| 49 | 300D Residual stacked encoders | 89.8 | No | Shortcut-Stacked Sentence Encoders for Multi-Dom... | 2017-08-07 | Code |
| 50 | Distance-based Self-Attention Network | 89.6 | No | Distance-based Self-Attention Network for Natura... | 2017-12-06 | - |
| 51 | 200D decomposable attention feed-forward model | 89.5 | No | A Decomposable Attention Model for Natural Langu... | 2016-06-06 | Code |
| 52 | 200D decomposable attention model | 89.5 | No | A Decomposable Attention Model for Natural Langu... | 2016-06-06 | Code |
| 53 | 300D SPINN-PI encoders | 89.2 | No | A Fast Unified Model for Parsing and Sentence Un... | 2016-03-19 | Code |
| 54 | SLRC | 89.1 | No | Explicit Contextual Semantics for Text Comprehen... | 2018-09-08 | - |
| 55 | 2400D Multiple-Dynamic Self-Attention Model | 89.0 | No | Dynamic Self-Attention : Computing Attention ove... | 2018-08-22 | Code |
| 56 | Biattentive Classification Network + CoVe + Char | 88.5 | No | Learned in Translation: Contextualized Word Vect... | 2017-08-01 | Code |
| 57 | 300D Full tree matching NTI-SLSTM-LSTM w/ global attention | 88.5 | No | Neural Tree Indexers for Text Understanding | 2016-07-15 | Code |
| 58 | 450D LSTMN with deep attention fusion | 88.5 | No | Long Short-Term Memory-Networks for Machine Read... | 2016-01-25 | Code |
| 59 | 600D Dynamic Self-Attention Model | 87.3 | No | Dynamic Self-Attention : Computing Attention ove... | 2018-08-22 | Code |
| 60 | 300D CAFE (no cross-sentence attention) | 87.3 | No | Compare, Compress and Propagate: Enhancing Neura... | 2017-12-30 | - |
| 61 | 300D LSTMN with deep attention fusion | 87.3 | No | Long Short-Term Memory-Networks for Machine Read... | 2016-01-25 | Code |
| 62 | 300D MMA-NSE encoders with attention | 86.9 | No | Neural Semantic Encoders | 2016-07-14 | Code |
| 63 | 50D stacked TC-LSTMs | 86.7 | No | Modelling Interaction of Sentence Pair with coup... | 2016-05-18 | - |
| 64 | 600D (300+300) BiLSTM encoders | 86.4 | No | Learning Natural Language Inference using Bidire... | 2016-05-30 | Code |
| 65 | 300D NSE encoders | 86.2 | No | Neural Semantic Encoders | 2016-07-14 | Code |
| 66 | 600D (300+300) BiLSTM encoders with intra-attention and symbolic preproc. | 85.9 | No | Learning Natural Language Inference using Bidire... | 2016-05-30 | Code |
| 67 | 4096D BiLSTM with max-pooling | 85.6 | No | Supervised Learning of Universal Sentence Repres... | 2017-05-05 | Code |
| 68 | 100D LSTMs w/ word-by-word attention | 85.3 | No | Reasoning about Entailment with Neural Attention | 2015-09-22 | Code |
| 69 | 100D DF-LSTM | 85.2 | No | - | - | - |
| 70 | 100D LSTM encoders | 84.8 | No | A large annotated corpus for learning natural la... | 2015-08-21 | Code |
| 71 | 600D (300+300) BiLSTM encoders with intra-attention | 84.5 | No | Learning Natural Language Inference using Bidire... | 2016-05-30 | Code |
| 72 | 300D LSTM encoders | 83.9 | No | A Fast Unified Model for Parsing and Sentence Un... | 2016-03-19 | Code |
| 73 | 300D Tree-based CNN encoders | 83.3 | No | Natural Language Inference by Tree-Based Convolu... | 2015-12-28 | - |
| 74 | 300D NTI-SLSTM-LSTM encoders | 82.5 | No | Neural Tree Indexers for Text Understanding | 2016-07-15 | Code |
| 75 | Unlexicalized features | 49.4 | No | A large annotated corpus for learning natural la... | 2015-08-21 | Code |