Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

BM25

Reported on 30 benchmarks across 7 tasks · 7 papers · 13 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
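As a rough illustration of what exact-name matching implies, the sketch below groups a few rows by the literal model-name string. The row layout and the helper `group_by_exact_name` are hypothetical and are not this site's actual code; the values are copied from the listing on this page.

```python
from collections import defaultdict

# Hypothetical rows: (model name as reported, benchmark, metric, value).
# The values come from this page; the tuple layout is an assumption.
rows = [
    ("BM25", "MSMARCO (BEIR)", "nDCG@10", 0.228),
    ("BM25", "BioASQ (BEIR)", "nDCG@10", 0.514),
    ("BM25+CE", "MSMARCO (BEIR)", "nDCG@10", 0.413),
]

def group_by_exact_name(rows):
    """Group results by the literal model-name string, with no normalization."""
    grouped = defaultdict(list)
    for name, benchmark, metric, value in rows:
        grouped[name].append((benchmark, metric, value))
    return dict(grouped)

grouped = group_by_exact_name(rows)
# "BM25" and "BM25+CE" land in separate groups because the strings differ,
# while two papers that both write "BM25" are merged even if their
# implementations or parameters differ.
```

Under this kind of matching, any two results reported as "BM25" appear together on this page regardless of tokenizer, parameters, or implementation, so values are best compared within a single paper.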

Natural Language Processing · 20 results

Entity Resolution on Abt-Buy
Candidate Set Size · 2023-03-06
8000
best: 54500 (Sparkly k=50)
SOTA
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Entity Resolution on WDC Block - medium
Candidate Set Size · 2023-03-06
500000
SOTA
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Entity Resolution on WDC Block - medium
Recall · 2023-03-06
97.8
SOTA
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Entity Resolution on WDC Block - large
Candidate Set Size · 2023-03-06
20000000
SOTA
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Entity Resolution on WDC Block - large
Recall · 2023-03-06
95.5
SOTA
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Entity Resolution on WDC Block - small
Candidate Set Size · 2023-03-06
250000
SOTA
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Information Retrieval on EntityQuestions
Recall@20 · 2021-09-17
0.72
best: 0.838 (TOME-2)
SOTA
Simple Entity-Centric Questions Challenge Dense Retrievers arXiv:2109.08535
Information Retrieval on BEIR
nDCG@10 · 2023-05-23
44.48
best: 50.43 ($\ell_0$ Mask)
BM25 Query Augmentation Learned End-to-End arXiv:2305.14087
Entity Resolution on Abt-Buy
Recall · 2023-03-06
94.7
best: 99.5 (SC-Block)
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Entity Resolution on Amazon-Google
Candidate Set Size · 2023-03-06
40000
best: 165900 (Sparkly k=50)
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Entity Resolution on Amazon-Google
Recall · 2023-03-06
98.7
best: 99.6 (SC-Block)
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Cross-Lingual on SV-Ident
mAP@10 · 2022-09-19
9.43
best: 18.93 (sentence-transformers/distiluse-base-multilingual-cased-v1)
Overview of the SV-Ident 2022 Shared Task on Survey Variable Identification in Social Science Publications arXiv:2209.09062
Cross-Lingual Entity Linking on SV-Ident
mAP@10 · 2022-09-19
9.43
best: 18.93 (sentence-transformers/distiluse-base-multilingual-cased-v1)
Overview of the SV-Ident 2022 Shared Task on Survey Variable Identification in Social Science Publications arXiv:2209.09062
Passage Ranking on MS MARCO
MRR@10 · 2022-01-24
18.4
best: 44.3 (Fine-tuned SOTA)
Text and Code Embeddings by Contrastive Pre-Training arXiv:2201.10005
Information Retrieval on BSARD
Recall@100 · 2021-08-26
51.33
best: 74.78 (Two-tower Bi-Encoder (RoBERTa))
A Statutory Article Retrieval Dataset in French arXiv:2108.11792
Information Retrieval on BSARD
Recall@200 · 2021-08-26
56.78
best: 78.38 (Siamese Bi-Encoder (RoBERTa))
A Statutory Article Retrieval Dataset in French arXiv:2108.11792
Information Retrieval on BSARD
Recall@500 · 2021-08-26
64.71
best: 83.77 (Siamese Bi-Encoder (RoBERTa))
A Statutory Article Retrieval Dataset in French arXiv:2108.11792
Information Retrieval on MSMARCO (BEIR)
nDCG@10 · 2021-04-17
0.228
best: 0.413 (BM25+CE)
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models arXiv:2104.08663
Information Retrieval on PeerQA
MRR
0.4288
best: 0.4845 (Dragon+)
Information Retrieval on PeerQA
Recall@10
0.6388
best: 0.6851 (SPLADEv3)

Knowledge Base · 9 results

Data Integration on Abt-Buy
Candidate Set Size · 2023-03-06
8000
best: 54500 (Sparkly k=50)
SOTA
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Data Integration on WDC Block - medium
Candidate Set Size · 2023-03-06
500000
SOTA
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Data Integration on WDC Block - medium
Recall · 2023-03-06
97.8
SOTA
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Data Integration on WDC Block - large
Candidate Set Size · 2023-03-06
20000000
SOTA
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Data Integration on WDC Block - large
Recall · 2023-03-06
95.5
SOTA
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Data Integration on WDC Block - small
Candidate Set Size · 2023-03-06
250000
SOTA
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Data Integration on Abt-Buy
Recall · 2023-03-06
94.7
best: 99.5 (SC-Block)
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Data Integration on Amazon-Google
Candidate Set Size · 2023-03-06
40000
best: 165900 (Sparkly k=50)
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132
Data Integration on Amazon-Google
Recall · 2023-03-06
98.7
best: 99.6 (SC-Block)
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines arXiv:2303.03132

Medical · 1 result

Biomedical Information Retrieval on BioASQ (BEIR)
nDCG@10 · 2021-04-17
0.514
best: 0.579 (monoT5-3B)
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models arXiv:2104.08663