Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

BM25

Reported on 30 benchmarks across 7 tasks · 7 papers · 13 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
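
As a rough illustration of what exact-name matching implies, the sketch below filters a few result records by model name. The record layout and the `results_for` helper are hypothetical and are not the site's actual implementation; the values are copied from entries on this page.

```python
# Hypothetical records: (model name as reported, benchmark, metric, value).
# The tuple layout is an assumption made for illustration only.
results = [
    ("BM25", "MSMARCO (BEIR)", "nDCG@10", 0.228),
    ("BM25+CE", "MSMARCO (BEIR)", "nDCG@10", 0.413),
    ("BM25", "BioASQ (BEIR)", "nDCG@10", 0.514),
]


def results_for(model_name, records):
    """Return the records whose model name equals model_name exactly."""
    return [r for r in records if r[0] == model_name]


bm25_results = results_for("BM25", results)
for name, benchmark, metric, value in bm25_results:
    print(f"{name} on {benchmark}: {metric} = {value}")
```

Under this assumption, "BM25+CE" is treated as a separate model, and a differently cased name such as "bm25" matches nothing. Two papers that both report a model called "BM25" would be grouped together even if their configurations differ.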

Natural Language Processing · 20 results

  • Entity Resolution on Abt-Buy
    Candidate Set Size · 2023-03-06
    8000
    best: 54500 (Sparkly k=50)
    SOTA
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Entity Resolution on WDC Block - medium
    Candidate Set Size · 2023-03-06
    500000
    SOTA
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Entity Resolution on WDC Block - medium
    Recall · 2023-03-06
    97.8
    SOTA
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Entity Resolution on WDC Block - large
    Candidate Set Size · 2023-03-06
    20000000
    SOTA
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Entity Resolution on WDC Block - large
    Recall · 2023-03-06
    95.5
    SOTA
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Entity Resolution on WDC Block - small
    Candidate Set Size · 2023-03-06
    250000
    SOTA
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Information Retrieval on EntityQuestions
    Recall@20 · 2021-09-17
    0.72
    best: 0.838 (TOME-2)
    SOTA
    Simple Entity-Centric Questions Challenge Dense Retrievers · arXiv:2109.08535
  • Information Retrieval on BEIR
    nDCG@10 · 2023-05-23
    44.48
    best: 50.43 ($\ell_0$ Mask)
    BM25 Query Augmentation Learned End-to-End · arXiv:2305.14087
  • Entity Resolution on Abt-Buy
    Recall · 2023-03-06
    94.7
    best: 99.5 (SC-Block)
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Entity Resolution on Amazon-Google
    Candidate Set Size · 2023-03-06
    40000
    best: 165900 (Sparkly k=50)
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Entity Resolution on Amazon-Google
    Recall · 2023-03-06
    98.7
    best: 99.6 (SC-Block)
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Cross-Lingual on SV-Ident
    mAP@10 · 2022-09-19
    9.43
    best: 18.93 (sentence-transformers/distiluse-base-multilingual-cased-v1)
    Overview of the SV-Ident 2022 Shared Task on Survey Variable Identification in Social Science Publications · arXiv:2209.09062
  • Cross-Lingual Entity Linking on SV-Ident
    mAP@10 · 2022-09-19
    9.43
    best: 18.93 (sentence-transformers/distiluse-base-multilingual-cased-v1)
    Overview of the SV-Ident 2022 Shared Task on Survey Variable Identification in Social Science Publications · arXiv:2209.09062
  • Passage Ranking on MS MARCO
    MRR@10 · 2022-01-24
    18.4
    best: 44.3 (Fine-tuned SOTA)
    Text and Code Embeddings by Contrastive Pre-Training · arXiv:2201.10005
  • Information Retrieval on BSARD
    Recall@100 · 2021-08-26
    51.33
    best: 74.78 (Two-tower Bi-Encoder (RoBERTa))
    A Statutory Article Retrieval Dataset in French · arXiv:2108.11792
  • Information Retrieval on BSARD
    Recall@200 · 2021-08-26
    56.78
    best: 78.38 (Siamese Bi-Encoder (RoBERTa))
    A Statutory Article Retrieval Dataset in French · arXiv:2108.11792
  • Information Retrieval on BSARD
    Recall@500 · 2021-08-26
    64.71
    best: 83.77 (Siamese Bi-Encoder (RoBERTa))
    A Statutory Article Retrieval Dataset in French · arXiv:2108.11792
  • Information Retrieval on MSMARCO (BEIR)
    nDCG@10 · 2021-04-17
    0.228
    best: 0.413 (BM25+CE)
    BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models · arXiv:2104.08663
  • Information Retrieval on PeerQA
    MRR
    0.4288
    best: 0.4845 (Dragon+)
  • Information Retrieval on PeerQA
    Recall@10
    0.6388
    best: 0.6851 (SPLADEv3)

Knowledge Base · 9 results

  • Data Integration on Abt-Buy
    Candidate Set Size · 2023-03-06
    8000
    best: 54500 (Sparkly k=50)
    SOTA
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Data Integration on WDC Block - medium
    Candidate Set Size · 2023-03-06
    500000
    SOTA
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Data Integration on WDC Block - medium
    Recall · 2023-03-06
    97.8
    SOTA
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Data Integration on WDC Block - large
    Candidate Set Size · 2023-03-06
    20000000
    SOTA
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Data Integration on WDC Block - large
    Recall · 2023-03-06
    95.5
    SOTA
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Data Integration on WDC Block - small
    Candidate Set Size · 2023-03-06
    250000
    SOTA
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Data Integration on Abt-Buy
    Recall · 2023-03-06
    94.7
    best: 99.5 (SC-Block)
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Data Integration on Amazon-Google
    Candidate Set Size · 2023-03-06
    40000
    best: 165900 (Sparkly k=50)
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132
  • Data Integration on Amazon-Google
    Recall · 2023-03-06
    98.7
    best: 99.6 (SC-Block)
    SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines · arXiv:2303.03132

Medical · 1 result

  • Biomedical Information Retrieval on BioASQ (BEIR)
    nDCG@10 · 2021-04-17
    0.514
    best: 0.579 (monoT5-3B)
    BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models · arXiv:2104.08663