Word Sense Disambiguation on Words in Context

Metric: Accuracy (higher is better)
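
For readers unfamiliar with the metric, here is a minimal sketch of how accuracy could be computed for a binary task such as Words in Context, where each example asks whether a target word has the same sense in two contexts. The function name `accuracy` and the toy labels are hypothetical and used only for illustration; they are not taken from the benchmark data or from any listed paper's evaluation code.

```python
def accuracy(predictions, labels):
    """Return the percentage of examples whose prediction matches the gold label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predictions, labels))
    return 100.0 * correct / len(labels)


# Hypothetical toy data: True means "same sense in both contexts".
gold = [True, False, True, True, False]
pred = [True, False, False, True, True]

score = accuracy(pred, gold)
print(f"{score:.1f}")  # 3 of 5 correct, prints 60.0
```

Scores in the table below are on the same 0 to 100 scale, so a value near 50 is roughly what guessing would give if the two labels are evenly balanced.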

Results

#	Model	Accuracy (%)	Extra Data	Paper	Date	Code
1	COSINE + Transductive Learning	85.3	No	Fine-Tuning Pre-trained Language Model with Weak...	2020-10-15	Code
2	PaLM 540B (finetuned)	78.8	No	PaLM: Scaling Language Modeling with Pathways	2022-04-05	Code
3	ST-MoE-32B 269B (fine-tuned)	77.7	No	ST-MoE: Designing Stable and Transferable Sparse...	2022-02-17	Code
4	DeBERTa-Ensemble	77.5	No	DeBERTa: Decoding-enhanced BERT with Disentangle...	2020-06-05	Code
5	Vega v2 6B (fine-tuned)	77.4	No	Toward Efficient Language Model Pretraining and ...	2022-12-04	-
6	UL2 20B (fine-tuned)	77.3	No	UL2: Unifying Language Learning Paradigms	2022-05-10	Code
7	Turing NLR v5 XXL 5.4B (fine-tuned)	77.1	No	Toward Efficient Language Model Pretraining and ...	2022-12-04	-
8	T5-XXL 11B	76.9	No	Exploring the Limits of Transfer Learning with a...	2019-10-23	Code
9	DeBERTa-1.5B	76.4	No	DeBERTa: Decoding-enhanced BERT with Disentangle...	2020-06-05	Code
10	ST-MoE-L 4.1B (fine-tuned)	74	No	ST-MoE: Designing Stable and Transferable Sparse...	2022-02-17	Code
11	SenseBERT-large 340M	72.1	No	SenseBERT: Driving Some Sense into BERT	2019-08-15	-
12	SenseBERT-base 110M	70.3	No	SenseBERT: Driving Some Sense into BERT	2019-08-15	-
13	PaLM 2-L (one-shot)	66.8	No	PaLM 2 Technical Report	2023-05-17	Code
14	BERT-large 340M	65.5	No	WiC: the Word-in-Context Dataset for Evaluating ...	2018-08-28	-
15	FLAN-T5-Large 783M	64.7	No	LaMini-LM: A Diverse Herd of Distilled Models fr...	2023-04-27	Code
16	LaMini-F-T5 783M	63.8	No	LaMini-LM: A Diverse Herd of Distilled Models fr...	2023-04-27	Code
17	Context2vec	59.3	No	WiC: the Word-in-Context Dataset for Evaluating ...	2018-08-28	-
18	DeConf	58.7	No	WiC: the Word-in-Context Dataset for Evaluating ...	2018-08-28	-
19	SW2V	58.1	No	WiC: the Word-in-Context Dataset for Evaluating ...	2018-08-28	-
20	ELMo	57.7	No	WiC: the Word-in-Context Dataset for Evaluating ...	2018-08-28	-
21	T0-3B (CoT fine-tuned)	56.7	No	The CoT Collection: Improving Zero-shot and Few-...	2023-05-23	Code
22	N-Grammer 343M	56.1	No	N-Grammer: Augmenting Transformers with latent n...	2022-07-13	Code
23	AlexaTM 20B	53.3	No	AlexaTM 20B: Few-Shot Learning Using a Large-Sca...	2022-08-02	Code
24	Sentence LSTM	53.1	No	WiC: the Word-in-Context Dataset for Evaluating ...	2018-08-28	-
25	RoE-3B	52.97	No	Exploring the Benefits of Training Expert Langua...	2023-02-07	Code
26	LaMini-GPT 1.5B	52.4	No	LaMini-LM: A Diverse Herd of Distilled Models fr...	2023-04-27	Code
27	KiC-770M	52.4	No	Knowledge-in-Context: Towards Knowledgeable Semi...	2022-10-28	-
28	PaLM 2-M (one-shot)	52	No	PaLM 2 Technical Report	2023-05-17	Code
29	Hybrid H3 125M (0-shot, logit scoring)	51.4	No	Hungry Hungry Hippos: Towards Language Modeling ...	2022-12-28	Code
30	Hybrid H3 125M (0-shot, rank classification)	51.4	No	Hungry Hungry Hippos: Towards Language Modeling ...	2022-12-28	Code
31	PaLM 2-S (one-shot)	50.6	No	PaLM 2 Technical Report	2023-05-17	Code
32	LaMini-T5 738M	50.5	No	LaMini-LM: A Diverse Herd of Distilled Models fr...	2023-04-27	Code
33	Flipped-3B	50.42	No	Guess the Instruction! Flipped Learning Makes La...	2022-10-06	Code
34	GPT-2-XL 1.5B	49.8	No	LaMini-LM: A Diverse Herd of Distilled Models fr...	2023-04-27	Code
35	UL2 20B (0-shot)	49.8	No	UL2: Unifying Language Learning Paradigms	2022-05-10	Code
36	GPT-3 175B (few-shot, k=32)	49.4	No	Language Models are Few-Shot Learners	2020-05-28	Code
37	Hybrid H3 125M (3-shot, logit scoring)	49.1	No	Hungry Hungry Hippos: Towards Language Modeling ...	2022-12-28	Code