Jinhyuk Lee, Mujeen Sung, Jaewoo Kang, Danqi Chen
Open-domain question answering can be reformulated as a phrase retrieval problem, without the need for processing documents on-demand during inference (Seo et al., 2019). However, current phrase retrieval models heavily depend on sparse representations and still underperform retriever-reader approaches. In this work, we show for the first time that we can learn dense representations of phrases alone that achieve much stronger performance in open-domain QA. We present an effective method to learn phrase representations from the supervision of reading comprehension tasks, coupled with novel negative sampling methods. We also propose a query-side fine-tuning strategy, which can support transfer learning and reduce the discrepancy between training and inference. On five popular open-domain QA datasets, our model DensePhrases improves over previous phrase retrieval models by 15%-25% absolute accuracy and matches the performance of state-of-the-art retriever-reader models. Our model is easy to parallelize due to pure dense representations and processes more than 10 questions per second on CPUs. Finally, we directly use our pre-indexed dense phrase representations for two slot filling tasks, showing the promise of utilizing DensePhrases as a dense knowledge base for downstream tasks.
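
To make the phrase retrieval formulation concrete, the following is a minimal sketch of answering a question by maximum inner product search over pre-indexed dense phrase vectors. It is not the DensePhrases implementation: the function names (`inner_product`, `retrieve_phrases`), the three-dimensional vectors, and the toy phrases are all hypothetical and chosen by hand for illustration. A real system would produce the vectors with trained phrase and question encoders and search a far larger index.

```python
from typing import List, Sequence, Tuple


def inner_product(u: Sequence[float], v: Sequence[float]) -> float:
    """Dense similarity score between a question vector and a phrase vector."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_phrases(
    query_vec: Sequence[float],
    phrase_index: List[Tuple[str, Sequence[float]]],
    top_k: int = 1,
) -> List[Tuple[str, float]]:
    """Return the top_k phrases ranked by inner product with the query.

    No document is read at query time: every candidate phrase was
    encoded and stored ahead of time in phrase_index.
    """
    scored = [(inner_product(query_vec, vec), phrase) for phrase, vec in phrase_index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(phrase, score) for score, phrase in scored[:top_k]]


# Hypothetical pre-computed index (hand-made vectors, not model output).
phrase_index = [
    ("Barack Obama", [0.9, 0.1, 0.0]),
    ("Honolulu", [0.1, 0.8, 0.3]),
    ("1961", [0.0, 0.1, 0.9]),
]

# Stands in for the encoding of a question such as "Where was Obama born?".
query_vec = [0.2, 0.9, 0.1]

top = retrieve_phrases(query_vec, phrase_index, top_k=2)
```

Because only the question is encoded at inference time and scoring reduces to inner products, the search can be split across index shards, which is consistent with the parallelization claim in the abstract.
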
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Question Answering | SQuAD1.1 dev | EM | 78.3 | DensePhrases |
| Question Answering | SQuAD1.1 dev | F1 | 86.3 | DensePhrases |
| Question Answering | Natural Questions (long) | EM | 71.9 | DensePhrases |
| Question Answering | Natural Questions (long) | F1 | 79.6 | DensePhrases |
| Slot Filling | KILT: T-REx | Accuracy | 53.9 | DensePhrases |
| Slot Filling | KILT: T-REx | F1 | 61.74 | DensePhrases |
| Slot Filling | KILT: T-REx | KILT-AC | 27.84 | DensePhrases |
| Slot Filling | KILT: T-REx | KILT-F1 | 32.34 | DensePhrases |
| Slot Filling | KILT: T-REx | R-Prec | 37.62 | DensePhrases |
| Slot Filling | KILT: T-REx | Recall@5 | 40.07 | DensePhrases |
| Slot Filling | KILT: Zero Shot RE | Accuracy | 47.42 | DensePhrases |
| Slot Filling | KILT: Zero Shot RE | F1 | 54.75 | DensePhrases |
| Slot Filling | KILT: Zero Shot RE | KILT-AC | 41.34 | DensePhrases |
| Slot Filling | KILT: Zero Shot RE | KILT-F1 | 46.79 | DensePhrases |
| Slot Filling | KILT: Zero Shot RE | R-Prec | 57.43 | DensePhrases |
| Slot Filling | KILT: Zero Shot RE | Recall@5 | 60.47 | DensePhrases |
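
For readers unfamiliar with the EM and F1 columns, the sketch below shows the SQuAD-style definitions commonly used for these metrics: exact match after answer normalization, and token-level F1. It assumes the table's numbers follow this convention; the official evaluation scripts for each benchmark (KILT in particular) may differ in details, and the helper names here are illustrative.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Dataset-level scores are typically the average of these per-question values, taking the maximum over gold answers when a question has several, reported as percentages.
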