Common Sense Reasoning on RuCoS

Metric: EM (higher is better)
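
As a rough illustration of the metric, the sketch below computes exact match (EM) for a set of predictions: a prediction scores 1 if it equals any of the gold answers and 0 otherwise, and the reported figure is the mean over all examples. The function names and the normalization step (lowercasing and collapsing whitespace) are assumptions made for illustration; the official RuCoS scorer may normalize answers differently.

```python
from typing import List, Sequence


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization, not the official one)."""
    return " ".join(text.lower().split())


def exact_match(prediction: str, gold_answers: Sequence[str]) -> float:
    """Return 1.0 if the prediction equals any gold answer after normalization, else 0.0."""
    pred = normalize(prediction)
    return float(any(pred == normalize(gold) for gold in gold_answers))


def mean_exact_match(predictions: List[str], references: List[Sequence[str]]) -> float:
    """Average the per-example EM scores over the whole evaluation set."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    scores = [exact_match(p, refs) for p, refs in zip(predictions, references)]
    return sum(scores) / len(scores)
```

Under this definition, an EM of 0.924 would mean that about 92.4% of predicted answers matched a gold answer exactly.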

Results

#	Model	EM	Extra Data	Paper	Date	Code
1	Golden Transformer	0.924	No	-	-	-
2	Human Benchmark	0.89	No	RussianSuperGLUE: A Russian Language Understandi...	2020-10-29	Code
3	YaLM 1.0B few-shot	0.859	No	-	-	-
4	ruT5-large-finetune	0.764	No	-	-	-
5	ruT5-base-finetune	0.752	No	-	-	-
6	ruBert-base finetune	0.716	No	-	-	-
7	ruRoberta-large finetune	0.716	No	-	-	-
8	RuGPT3XL few-shot	0.665	No	-	-	-
9	ruBert-large finetune	0.658	No	-	-	-
10	MT5 Large	0.562	No	mT5: A massively multilingual pre-trained text-t...	2020-10-22	Code
11	SBERT_Large	0.351	No	-	-	-
12	SBERT_Large_mt_ru_finetuning	0.347	No	-	-	-
13	RuBERT plain	0.314	No	-	-	-
14	Multilingual Bert	0.29	No	-	-	-
15	heuristic majority	0.257	No	Unreasonable Effectiveness of Rule-Based Heurist...	2021-05-03	-
16	Baseline TF-IDF1.1	0.252	No	RussianSuperGLUE: A Russian Language Understandi...	2020-10-29	Code
17	Random weighted	0.247	No	Unreasonable Effectiveness of Rule-Based Heurist...	2021-05-03	-
18	majority_class	0.247	No	Unreasonable Effectiveness of Rule-Based Heurist...	2021-05-03	-
19	RuGPT3Medium	0.224	No	-	-	-
20	RuBERT conversational	0.218	No	-	-	-
21	RuGPT3Small	0.204	No	-	-	-
22	RuGPT3Large	0.202	No	-	-	-