Niklas Muennighoff
Decoder transformers have continued increasing in scale, reaching hundreds of billions of parameters. Due to their scale, the same decoder sets state-of-the-art results on various language tasks via prompting or fine-tuning. Yet, these large foundation models remain unusable for the related fields of semantic search and sentence embeddings. This prevents possible new state-of-the-art results and forces organizations to train and maintain separate models. To this end, we propose SGPT to use decoders for sentence embeddings and semantic search via prompting or fine-tuning. At 5.8 billion parameters, SGPT improves on the previously best sentence embeddings by a margin of 7% and outperforms a concurrent method with 175 billion parameters, as measured on the BEIR search benchmark. Code, models, and result files are freely available at https://github.com/Muennighoff/sgpt.
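The bi-encoder variant (SGPT-BE) derives a sentence embedding by pooling the decoder's last hidden states, weighting later positions more heavily since, with causal attention, later tokens have seen more context. A minimal sketch of such position-weighted mean pooling in plain Python (dummy inputs, no model library; the function name is illustrative, not from the SGPT codebase):

```python
def position_weighted_mean(hidden_states, attention_mask):
    """Pool token vectors into one sentence vector.

    hidden_states: list of token vectors (each a list of floats),
                   one per sequence position.
    attention_mask: list of 0/1 flags; padding positions get weight 0.
    Weight of position i is proportional to (i + 1), so later
    (more context-aware) tokens contribute more.
    """
    dim = len(hidden_states[0])
    weights = [(i + 1) * m for i, m in enumerate(attention_mask)]
    total = sum(weights)
    pooled = [0.0] * dim
    for w, vec in zip(weights, hidden_states):
        if w == 0:
            continue  # skip padding tokens
        for d in range(dim):
            pooled[d] += w * vec[d] / total
    return pooled


# Toy example: 3 positions, 2 dimensions, last position is padding.
hidden = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
mask = [1, 1, 0]
print(position_weighted_mean(hidden, mask))  # → [0.333..., 0.666...]
```

In practice the hidden states would come from a GPT-style decoder, and the pooled vectors would be compared with cosine similarity for search.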
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Question Answering | HotpotQA (BEIR) | nDCG@10 | 0.699 | SGPT-CE-6.1B |
| Question Answering | HotpotQA (BEIR) | nDCG@10 | 0.593 | SGPT-BE-5.8B |
| Question Answering | NQ (BEIR) | nDCG@10 | 0.524 | SGPT-BE-5.8B |
| Question Answering | NQ (BEIR) | nDCG@10 | 0.401 | SGPT-CE-6.1B |
| Question Answering | FiQA-2018 (BEIR) | nDCG@10 | 0.401 | SGPT-CE-6.1B |
| Question Answering | FiQA-2018 (BEIR) | nDCG@10 | 0.372 | SGPT-BE-5.8B |
| Information Retrieval | CQADupStack | mAP@100 | 0.16 | SGPT-BE-5.8B |
| Information Retrieval | MSMARCO (BEIR) | nDCG@10 | 0.399 | SGPT-BE-5.8B |
| Information Retrieval | MSMARCO (BEIR) | nDCG@10 | 0.29 | SGPT-CE-6.1B |
| Information Retrieval | MSMARCO (BEIR) | nDCG@10 | 0.278 | SGPT-CE-2.7B |
| Biomedical Information Retrieval | NFCorpus (BEIR) | nDCG@10 | 0.362 | SGPT-BE-5.8B |
| Biomedical Information Retrieval | NFCorpus (BEIR) | nDCG@10 | 0.358 | OpenAI Search-Davinci |
| Biomedical Information Retrieval | NFCorpus (BEIR) | nDCG@10 | 0.347 | SGPT-CE-6.1B |
| Biomedical Information Retrieval | NFCorpus (BEIR) | nDCG@10 | 0.333 | SGPT-CE-2.7B |
| Biomedical Information Retrieval | BioASQ (BEIR) | nDCG@10 | 0.547 | SGPT-CE-6.1B |
| Biomedical Information Retrieval | BioASQ (BEIR) | nDCG@10 | 0.546 | SGPT-CE-2.7B |
| Biomedical Information Retrieval | BioASQ (BEIR) | nDCG@10 | 0.413 | SGPT-BE-5.8B |
| Biomedical Information Retrieval | TREC-COVID (BEIR) | nDCG@10 | 0.873 | SGPT-BE-5.8B |
| Biomedical Information Retrieval | TREC-COVID (BEIR) | nDCG@10 | 0.791 | SGPT-CE-6.1B |
| Biomedical Information Retrieval | TREC-COVID (BEIR) | nDCG@10 | 0.762 | SGPT-CE-2.7B |
| Fact Checking | CLIMATE-FEVER (BEIR) | nDCG@10 | 0.305 | SGPT-BE-5.8B |
| Fact Checking | CLIMATE-FEVER (BEIR) | nDCG@10 | 0.161 | SGPT-CE-6.1B |
| Fact Checking | FEVER (BEIR) | nDCG@10 | 0.783 | SGPT-BE-5.8B |
| Fact Checking | FEVER (BEIR) | nDCG@10 | 0.725 | SGPT-CE-6.1B |
| Fact Checking | SciFact (BEIR) | nDCG@10 | 0.747 | SGPT-BE-5.8B |
| Fact Checking | SciFact (BEIR) | nDCG@10 | 0.682 | SGPT-CE-6.1B |
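Most rows above report nDCG@10, which scores the top 10 ranked documents by graded relevance with a logarithmic position discount, normalized by the best possible ranking. A short reference implementation (standard formula, not code from the SGPT repository):

```python
import math


def ndcg_at_k(relevances, k):
    """nDCG@k for one query.

    relevances: graded relevance of each retrieved document,
                in the order the system ranked them.
    Returns DCG@k divided by the ideal DCG@k (IDCG), so a
    perfect ranking scores 1.0.
    """
    def dcg(rels):
        # Position 1 is discounted by log2(2) = 1, position 2 by log2(3), ...
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Worked example: a ranking with graded relevances 0-3.
print(round(ndcg_at_k([3, 2, 3, 0, 1, 2], 10), 4))  # → 0.9608
```

A table value such as 0.873 on TREC-COVID is this quantity averaged over all queries in the dataset.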