
WiC-TSV: An Evaluation Benchmark for Target Sense Verification of Words in Context

Anna Breit, Artem Revenko, Kiamehr Rezaee, Mohammad Taher Pilehvar, Jose Camacho-Collados

2020-04-30 | EACL 2021 | Binary Classification, Entity Linking, Word Sense Disambiguation

Abstract

We present WiC-TSV, a new multi-domain evaluation benchmark for Word Sense Disambiguation. More specifically, we introduce a framework for Target Sense Verification of Words in Context which grounds its uniqueness in the formulation as a binary classification task thus being independent of external sense inventories, and the coverage of various domains. This makes the dataset highly flexible for the evaluation of a diverse set of models and systems in and across domains. WiC-TSV provides three different evaluation settings, depending on the input signals provided to the model. We set baseline performance on the dataset using state-of-the-art language models. Experimental results show that even though these models can perform decently on the task, there remains a gap between machine and human performance, especially in out-of-domain settings. WiC-TSV data is available at https://competitions.codalab.org/competitions/23683
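The binary formulation described in the abstract can be illustrated with a minimal sketch. The field names (`context`, `target`, `definition`, `hypernyms`, `label`) and the toy instances below are assumptions made for illustration, not the official data schema or real benchmark items. The sketch scores a predictor by accuracy and includes a trivial predictor that always answers true, in the spirit of the "All true" rows in the results.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TSVInstance:
    """One target-sense-verification example (hypothetical field names)."""
    context: str                      # sentence containing the target word
    target: str                       # the word whose sense is being verified
    definition: Optional[str]         # sense descriptor, if this signal is given
    hypernyms: Optional[List[str]]    # sense descriptor, if this signal is given
    label: bool                       # True if the target is used in this sense


def accuracy(instances: List[TSVInstance],
             predict: Callable[[TSVInstance], bool]) -> float:
    """Fraction of instances where the prediction matches the gold label."""
    correct = sum(predict(x) == x.label for x in instances)
    return correct / len(instances)


def all_true(_: TSVInstance) -> bool:
    """Trivial baseline: always claim the intended sense matches."""
    return True


# Invented toy examples, not taken from the dataset.
toy = [
    TSVInstance("He sat on the bank of the river.", "bank",
                "sloping land beside a body of water", ["slope"], True),
    TSVInstance("She deposited cash at the bank.", "bank",
                "sloping land beside a body of water", ["slope"], False),
    TSVInstance("The bat flew out of the cave.", "bat",
                "nocturnal flying mammal", ["mammal"], True),
    TSVInstance("He swung the bat at the ball.", "bat",
                "nocturnal flying mammal", ["mammal"], False),
]

all_true_acc = accuracy(toy, all_true)
print(f"All-true accuracy on toy data: {all_true_acc:.2f}")
```

Because each instance carries its own sense descriptor, no external sense inventory is needed at evaluation time; the always-true predictor simply recovers the share of positive labels.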

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Word Sense Disambiguation | WiC-TSV | Task 1 Accuracy: all | 75.3 | Bert-base |
| Word Sense Disambiguation | WiC-TSV | Task 1 Accuracy: domain specific | 77.9 | Bert-base |
| Word Sense Disambiguation | WiC-TSV | Task 1 Accuracy: general purpose | 73.3 | Bert-base |
| Word Sense Disambiguation | WiC-TSV | Task 2 Accuracy: all | 71.7 | Bert-base |
| Word Sense Disambiguation | WiC-TSV | Task 2 Accuracy: domain specific | 74.7 | Bert-base |
| Word Sense Disambiguation | WiC-TSV | Task 2 Accuracy: general purpose | 68.6 | Bert-base |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: all | 76.6 | Bert-base |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: domain specific | 80.4 | Bert-base |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: general purpose | 73.5 | Bert-base |
| Word Sense Disambiguation | WiC-TSV | Task 1 Accuracy: all | 54.4 | Unsupervised Bert |
| Word Sense Disambiguation | WiC-TSV | Task 1 Accuracy: domain specific | 60.6 | Unsupervised Bert |
| Word Sense Disambiguation | WiC-TSV | Task 1 Accuracy: general purpose | 49.2 | Unsupervised Bert |
| Word Sense Disambiguation | WiC-TSV | Task 2 Accuracy: all | 62.8 | Unsupervised Bert |
| Word Sense Disambiguation | WiC-TSV | Task 2 Accuracy: domain specific | 69.1 | Unsupervised Bert |
| Word Sense Disambiguation | WiC-TSV | Task 2 Accuracy: general purpose | 57.6 | Unsupervised Bert |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: all | 60.5 | Unsupervised Bert |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: domain specific | 67.9 | Unsupervised Bert |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: general purpose | 54.4 | Unsupervised Bert |
| Word Sense Disambiguation | WiC-TSV | Task 1 Accuracy: all | 53.7 | FastText |
| Word Sense Disambiguation | WiC-TSV | Task 1 Accuracy: domain specific | 50.6 | FastText |
| Word Sense Disambiguation | WiC-TSV | Task 1 Accuracy: general purpose | 56.2 | FastText |
| Word Sense Disambiguation | WiC-TSV | Task 2 Accuracy: all | 52.7 | FastText |
| Word Sense Disambiguation | WiC-TSV | Task 2 Accuracy: domain specific | 47.7 | FastText |
| Word Sense Disambiguation | WiC-TSV | Task 2 Accuracy: general purpose | 56.8 | FastText |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: all | 53.4 | FastText |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: domain specific | 49 | FastText |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: general purpose | 57.1 | FastText |
| Word Sense Disambiguation | WiC-TSV | Task 1 Accuracy: all | 50.8 | All true |
| Word Sense Disambiguation | WiC-TSV | Task 1 Accuracy: domain specific | 47 | All true |
| Word Sense Disambiguation | WiC-TSV | Task 1 Accuracy: general purpose | 53.8 | All true |
| Word Sense Disambiguation | WiC-TSV | Task 2 Accuracy: all | 50.8 | All true |
| Word Sense Disambiguation | WiC-TSV | Task 2 Accuracy: domain specific | 47 | All true |
| Word Sense Disambiguation | WiC-TSV | Task 2 Accuracy: general purpose | 53.8 | All true |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: all | 50.8 | All true |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: domain specific | 47 | All true |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: general purpose | 53.8 | All true |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: all | 85.3 | Human |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: domain specific | 89.2 | Human |
| Word Sense Disambiguation | WiC-TSV | Task 3 Accuracy: general purpose | 82.1 | Human |
| Sentiment Analysis | TweetEval | Hate | 50.6 | FastText |
| Entity Linking | WiC-TSV | Task 1 Accuracy: all | 75.3 | Bert-base |
| Entity Linking | WiC-TSV | Task 1 Accuracy: domain specific | 77.9 | Bert-base |
| Entity Linking | WiC-TSV | Task 1 Accuracy: general purpose | 73.3 | Bert-base |
| Entity Linking | WiC-TSV | Task 2 Accuracy: all | 71.7 | Bert-base |
| Entity Linking | WiC-TSV | Task 2 Accuracy: domain specific | 74.7 | Bert-base |
| Entity Linking | WiC-TSV | Task 2 Accuracy: general purpose | 68.6 | Bert-base |
| Entity Linking | WiC-TSV | Task 3 Accuracy: all | 76.6 | Bert-base |
| Entity Linking | WiC-TSV | Task 3 Accuracy: domain specific | 80.4 | Bert-base |
| Entity Linking | WiC-TSV | Task 3 Accuracy: general purpose | 73.5 | Bert-base |
| Entity Linking | WiC-TSV | Task 1 Accuracy: all | 54.4 | Unsupervised Bert |
| Entity Linking | WiC-TSV | Task 1 Accuracy: domain specific | 60.6 | Unsupervised Bert |
| Entity Linking | WiC-TSV | Task 1 Accuracy: general purpose | 49.2 | Unsupervised Bert |
| Entity Linking | WiC-TSV | Task 2 Accuracy: all | 62.8 | Unsupervised Bert |
| Entity Linking | WiC-TSV | Task 2 Accuracy: domain specific | 69.1 | Unsupervised Bert |
| Entity Linking | WiC-TSV | Task 2 Accuracy: general purpose | 57.6 | Unsupervised Bert |
| Entity Linking | WiC-TSV | Task 3 Accuracy: all | 60.5 | Unsupervised Bert |
| Entity Linking | WiC-TSV | Task 3 Accuracy: domain specific | 67.9 | Unsupervised Bert |
| Entity Linking | WiC-TSV | Task 3 Accuracy: general purpose | 54.4 | Unsupervised Bert |
| Entity Linking | WiC-TSV | Task 1 Accuracy: all | 53.7 | FastText |
| Entity Linking | WiC-TSV | Task 1 Accuracy: domain specific | 50.6 | FastText |
| Entity Linking | WiC-TSV | Task 1 Accuracy: general purpose | 56.2 | FastText |
| Entity Linking | WiC-TSV | Task 2 Accuracy: all | 52.7 | FastText |
| Entity Linking | WiC-TSV | Task 2 Accuracy: domain specific | 47.7 | FastText |
| Entity Linking | WiC-TSV | Task 2 Accuracy: general purpose | 56.8 | FastText |
| Entity Linking | WiC-TSV | Task 3 Accuracy: all | 53.4 | FastText |
| Entity Linking | WiC-TSV | Task 3 Accuracy: domain specific | 49 | FastText |
| Entity Linking | WiC-TSV | Task 3 Accuracy: general purpose | 57.1 | FastText |
| Entity Linking | WiC-TSV | Task 1 Accuracy: all | 50.8 | All true |
| Entity Linking | WiC-TSV | Task 1 Accuracy: domain specific | 47 | All true |
| Entity Linking | WiC-TSV | Task 1 Accuracy: general purpose | 53.8 | All true |
| Entity Linking | WiC-TSV | Task 2 Accuracy: all | 50.8 | All true |
| Entity Linking | WiC-TSV | Task 2 Accuracy: domain specific | 47 | All true |
| Entity Linking | WiC-TSV | Task 2 Accuracy: general purpose | 53.8 | All true |
| Entity Linking | WiC-TSV | Task 3 Accuracy: all | 50.8 | All true |
| Entity Linking | WiC-TSV | Task 3 Accuracy: domain specific | 47 | All true |
| Entity Linking | WiC-TSV | Task 3 Accuracy: general purpose | 53.8 | All true |
| Entity Linking | WiC-TSV | Task 3 Accuracy: all | 85.3 | Human |
| Entity Linking | WiC-TSV | Task 3 Accuracy: domain specific | 89.2 | Human |
| Entity Linking | WiC-TSV | Task 3 Accuracy: general purpose | 82.1 | Human |
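
The machine-human gap mentioned in the abstract can be read off the Task 3 rows, the only setting with a Human entry. The short sketch below copies the Bert-base and Human Task 3 accuracies from the results and subtracts them; the dictionary layout and variable names are illustrative choices, not part of the benchmark.

```python
# Task 3 accuracies copied from the results listed above.
bert_base = {"all": 76.6, "domain specific": 80.4, "general purpose": 73.5}
human = {"all": 85.3, "domain specific": 89.2, "general purpose": 82.1}

# Gap in accuracy points, rounded to one decimal to avoid float noise.
gaps = {split: round(human[split] - bert_base[split], 1) for split in human}

for split, gap in gaps.items():
    print(f"Task 3, {split}: human - Bert-base = {gap} points")
```

On these reported numbers the gap is between 8 and 9 accuracy points for every split of the Task 3 test data.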
