Word Sense Disambiguation on Words in Context
Metric: Accuracy (higher is better)
Results
| # | Model | Accuracy | Extra Data | Paper | Date | Code | SOTA |
|---|-------|----------|------------|-------|------|------|------|
| 1 | COSINE + Transductive Learning | 85.3 | No | Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self-Training Approach | 2020-10-15 | Code | SOTA |
| 2 | PaLM 540B (fine-tuned) | 78.8 | No | PaLM: Scaling Language Modeling with Pathways | 2022-04-05 | Code | |
| 3 | ST-MoE-32B 269B (fine-tuned) | 77.7 | No | ST-MoE: Designing Stable and Transferable Sparse Expert Models | 2022-02-17 | Code | |
| 4 | DeBERTa-Ensemble | 77.5 | No | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | 2020-06-05 | Code | SOTA |
| 5 | Vega v2 6B (fine-tuned) | 77.4 | No | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | 2022-12-04 | - | |
| 6 | UL2 20B (fine-tuned) | 77.3 | No | UL2: Unifying Language Learning Paradigms | 2022-05-10 | Code | |
| 7 | Turing NLR v5 XXL 5.4B (fine-tuned) | 77.1 | No | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | 2022-12-04 | - | |
| 8 | T5-XXL 11B | 76.9 | No | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 2019-10-23 | Code | SOTA |
| 9 | DeBERTa-1.5B | 76.4 | No | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | 2020-06-05 | Code | |
| 10 | ST-MoE-L 4.1B (fine-tuned) | 74 | No | ST-MoE: Designing Stable and Transferable Sparse Expert Models | 2022-02-17 | Code | |
| 11 | SenseBERT-large 340M | 72.1 | No | SenseBERT: Driving Some Sense into BERT | 2019-08-15 | - | SOTA |
| 12 | SenseBERT-base 110M | 70.3 | No | SenseBERT: Driving Some Sense into BERT | 2019-08-15 | - | |
| 13 | PaLM 2-L (one-shot) | 66.8 | No | PaLM 2 Technical Report | 2023-05-17 | Code | |
| 14 | BERT-large 340M | 65.5 | No | WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations | 2018-08-28 | - | SOTA |
| 15 | FLAN-T5-Large 783M | 64.7 | No | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | 2023-04-27 | Code | |
| 16 | LaMini-F-T5 783M | 63.8 | No | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | 2023-04-27 | Code | |
| 17 | Context2vec | 59.3 | No | WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations | 2018-08-28 | - | |
| 18 | DeConf | 58.7 | No | WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations | 2018-08-28 | - | |
| 19 | SW2V | 58.1 | No | WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations | 2018-08-28 | - | |
| 20 | ELMo | 57.7 | No | WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations | 2018-08-28 | - | |
| 21 | T0-3B (CoT fine-tuned) | 56.7 | No | The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning | 2023-05-23 | Code | |
| 22 | N-Grammer 343M | 56.1 | No | N-Grammer: Augmenting Transformers with latent n-grams | 2022-07-13 | Code | |
| 23 | AlexaTM 20B | 53.3 | No | AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model | 2022-08-02 | Code | |
| 24 | Sentence LSTM | 53.1 | No | WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations | 2018-08-28 | - | |
| 25 | RoE-3B | 52.97 | No | Exploring the Benefits of Training Expert Language Models over Instruction Tuning | 2023-02-07 | Code | |
| 26 | LaMini-GPT 1.5B | 52.4 | No | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | 2023-04-27 | Code | |
| 27 | KiC-770M | 52.4 | No | Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models | 2022-10-28 | - | |
| 28 | PaLM 2-M (one-shot) | 52 | No | PaLM 2 Technical Report | 2023-05-17 | Code | |
| 29 | Hybrid H3 125M (0-shot, logit scoring) | 51.4 | No | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | 2022-12-28 | Code | |
| 30 | Hybrid H3 125M (0-shot, rank classification) | 51.4 | No | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | 2022-12-28 | Code | |
| 31 | PaLM 2-S (one-shot) | 50.6 | No | PaLM 2 Technical Report | 2023-05-17 | Code | |
| 32 | LaMini-T5 738M | 50.5 | No | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | 2023-04-27 | Code | |
| 33 | Flipped-3B | 50.42 | No | Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners | 2022-10-06 | Code | |
| 34 | GPT-2-XL 1.5B | 49.8 | No | LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions | 2023-04-27 | Code | |
| 35 | UL2 20B (0-shot) | 49.8 | No | UL2: Unifying Language Learning Paradigms | 2022-05-10 | Code | |
| 36 | GPT-3 175B (few-shot, k=32) | 49.4 | No | Language Models are Few-Shot Learners | 2020-05-28 | Code | |
| 37 | Hybrid H3 125M (3-shot, logit scoring) | 49.1 | No | Hungry Hungry Hippos: Towards Language Modeling with State Space Models | 2022-12-28 | Code | |