| 1 | T5-11B | 90.06 | Yes | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 2 | LUKE | 89.8 | Yes | LUKE: Deep Contextualized Entity Representations... | 2020-10-02 | Code |
| 3 | XLNet+DSC | 89.79 | Yes | Dice Loss for Data-imbalanced NLP Tasks | 2019-11-07 | Code |
| 4 | XLNet (single model) | 89.7 | Yes | XLNet: Generalized Autoregressive Pretraining fo... | 2019-06-19 | Code |
| 5 | T5-3B | 88.53 | Yes | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 6 | T5-Large 770M | 86.66 | No | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 7 | BERT-LARGE (Ensemble+TriviaQA) | 86.2 | No | BERT: Pre-training of Deep Bidirectional Transfo... | 2018-10-11 | Code |
| 8 | T5-Base | 85.44 | Yes | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 9 | BERT-LARGE (Single+TriviaQA) | 84.2 | No | BERT: Pre-training of Deep Bidirectional Transfo... | 2018-10-11 | Code |
| 10 | BERT-Large-uncased-PruneOFA (90% unstruct sparse) | 83.35 | No | Prune Once for All: Sparse Pre-Trained Language ... | 2021-11-10 | Code |
| 11 | BERT-Large-uncased-PruneOFA (90% unstruct sparse, QAT Int8) | 83.22 | No | Prune Once for All: Sparse Pre-Trained Language ... | 2021-11-10 | Code |
| 12 | BERT-Base-uncased-PruneOFA (85% unstruct sparse) | 81.1 | No | Prune Once for All: Sparse Pre-Trained Language ... | 2021-11-10 | Code |
| 13 | BERT-Base-uncased-PruneOFA (85% unstruct sparse, QAT Int8) | 80.84 | No | Prune Once for All: Sparse Pre-Trained Language ... | 2021-11-10 | Code |
| 14 | BERT-Base-uncased-PruneOFA (90% unstruct sparse) | 79.83 | No | Prune Once for All: Sparse Pre-Trained Language ... | 2021-11-10 | Code |
| 15 | TinyBERT-6 67M | 79.7 | No | TinyBERT: Distilling BERT for Natural Language U... | 2019-09-23 | Code |
| 16 | T5-Small | 79.1 | Yes | Exploring the Limits of Transfer Learning with a... | 2019-10-23 | Code |
| 17 | R.M-Reader (single) | 78.9 | No | Reinforced Mnemonic Reader for Machine Reading C... | 2017-05-08 | Code |
| 18 | DensePhrases | 78.3 | No | Learning Dense Representations of Phrases at Scale | 2020-12-23 | Code |
| 19 | DistilBERT-uncased-PruneOFA (85% unstruct sparse) | 78.1 | No | Prune Once for All: Sparse Pre-Trained Language ... | 2021-11-10 | Code |
| 20 | DistilBERT | 77.7 | No | DistilBERT, a distilled version of BERT: smaller... | 2019-10-02 | Code |
| 21 | DistilBERT-uncased-PruneOFA (85% unstruct sparse, QAT Int8) | 77.03 | No | Prune Once for All: Sparse Pre-Trained Language ... | 2021-11-10 | Code |
| 22 | DistilBERT-uncased-PruneOFA (90% unstruct sparse) | 76.91 | No | Prune Once for All: Sparse Pre-Trained Language ... | 2021-11-10 | Code |
| 23 | KAR | 76.7 | No | Explicit Utilization of General Knowledge in Mac... | 2018-09-10 | - |
| 24 | SAN (single) | 76.235 | No | Stochastic Answer Networks for Machine Reading C... | 2017-12-10 | Code |
| 25 | DistilBERT-uncased-PruneOFA (90% unstruct sparse, QAT Int8) | 75.62 | No | Prune Once for All: Sparse Pre-Trained Language ... | 2021-11-10 | Code |
| 26 | FusionNet | 75.3 | No | FusionNet: Fusing via Fully-Aware Attention with... | 2017-11-16 | Code |
| 27 | QANet (data aug x3) | 75.1 | No | QANet: Combining Local Convolution with Global S... | 2018-04-23 | Code |
| 28 | QANet (data aug x2) | 74.5 | No | QANet: Combining Local Convolution with Global S... | 2018-04-23 | Code |
| 29 | DCN+ (single) | 74.5 | No | DCN+: Mixed Objective and Deep Residual Coattent... | 2017-10-31 | Code |
| 30 | QANet | 73.6 | No | QANet: Combining Local Convolution with Global S... | 2018-04-23 | Code |
| 31 | PhaseCond (single) | 72.1 | No | Phase Conductor on Multi-layered Attentions for ... | 2017-10-28 | - |
| 32 | SRU | 71.4 | No | Simple Recurrent Units for Highly Parallelizable... | 2017-09-08 | Code |
| 33 | Smarnet | 71.362 | No | Smarnet: Teaching Machines to Read and Comprehen... | 2017-10-08 | - |
| 34 | DCN (Char + CoVe) | 71.3 | No | Learned in Translation: Contextualized Word Vect... | 2017-08-01 | Code |
| 35 | R-NET (single) | 71.1 | No | - | - | - |
| 36 | Ruminating Reader | 70.6 | No | Ruminating Reader: Reasoning with Gated Multi-Ho... | 2017-04-24 | - |
| 37 | FastQAExt (beam-size 5) | 70.3 | No | Making Neural QA as Simple as Possible but not S... | 2017-03-14 | Code |
| 38 | DrQA (Document Reader only) | 69.5 | No | Reading Wikipedia to Answer Open-Domain Questions | 2017-03-31 | Code |
| 39 | jNet (TreeLSTM adaptation, QTLa, K=100) | 69.1 | No | Exploring Question Understanding and Adaptation ... | 2017-03-14 | - |
| 40 | SEDT-LSTM | 67.89 | No | Structural Embedding of Syntactic Trees for Mach... | 2017-03-02 | - |
| 41 | BiDAF (single) | 67.7 | No | Bidirectional Attention Flow for Machine Compreh... | 2016-11-05 | Code |
| 42 | SECT-LSTM | 67.65 | No | Structural Embedding of Syntactic Trees for Mach... | 2017-03-02 | - |
| 43 | RaSoR | 66.4 | No | Learning Recurrent Span Representations for Extr... | 2016-11-04 | Code |
| 44 | MPCM | 66.1 | No | Multi-Perspective Context Matching for Machine C... | 2016-12-13 | Code |
| 45 | DCN | 65.4 | No | Dynamic Coattention Networks For Question Answer... | 2016-11-05 | Code |
| 46 | FABIR | 65.1 | No | A Fully Attention-Based Information Retriever | 2018-10-22 | Code |
| 47 | Match-LSTM with Bi-Ans-Ptr (Boundary+Search+b) | 64.1 | No | Machine Comprehension Using Match-LSTM and Answe... | 2016-08-29 | Code |
| 48 | OTF dict+spelling (single) | 63.06 | No | Learning to Compute Word Embeddings On the Fly | 2017-06-01 | - |
| 49 | DCR | 62.5 | No | End-to-End Answer Chunk Extraction and Ranking f... | 2016-10-31 | - |
| 50 | FG fine-grained gate | 59.95 | No | Words or Characters? Fine-grained Gating for Rea... | 2016-11-06 | Code |
| 51 | SPARTA | 59.3 | No | SPARTA: Efficient Open-Domain Question Answering... | 2020-09-28 | Code |
| 52 | Blended RAG | 57.63 | No | Blended RAG: Improving RAG (Retriever-Augmented ... | 2024-03-22 | Code |
| 53 | BERTserini | 50.2 | No | Data Augmentation for BERT Fine-Tuning in Open-D... | 2019-04-14 | - |
| 54 | BERTserini | 38.6 | No | End-to-End Open-Domain Question Answering with B... | 2019-02-05 | Code |