Common Sense Reasoning on RuCoS
Metric: Average F1 (higher is better)
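The page does not say how Average F1 is computed. As a rough sketch only, assuming a ReCoRD-style token-overlap F1 (RuCoS is the Russian counterpart of ReCoRD), the score for one question could be the best F1 against any gold answer, averaged over all questions. The function names `token_f1` and `average_f1`, the whitespace tokenization, and the lowercasing are illustrative assumptions, not the official evaluation script.

```python
from collections import Counter


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between one prediction and one gold answer.

    Assumption: lowercase + whitespace tokenization; the real scorer
    may normalize text differently.
    """
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def average_f1(predictions: list[str], gold_answers: list[list[str]]) -> float:
    """Mean over questions of the best F1 against any gold answer.

    Assumption: every question has at least one gold answer.
    """
    scores = [
        max(token_f1(pred, gold) for gold in golds)
        for pred, golds in zip(predictions, gold_answers)
    ]
    return sum(scores) / len(scores)
```

Under these assumptions, a model that answers one question exactly and half-matches another would score between 0.5 and 1.0, which is the same 0 to 1 scale the leaderboard values use.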
Results
| # | Model | Average F1 | Extra Data | Paper | Date | Code |
|---|-------|------------|------------|-------|------|------|
| 1 | Human Benchmark | 0.93 | No | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | 2020-10-29 | Yes |
| 2 | Golden Transformer | 0.92 | No | - | - | - |
| 3 | YaLM 1.0B few-shot | 0.86 | No | - | - | - |
| 4 | ruT5-large-finetune | 0.81 | No | - | - | - |
| 5 | ruT5-base-finetune | 0.79 | No | - | - | - |
| 6 | ruBert-base finetune | 0.74 | No | - | - | - |
| 7 | ruRoberta-large finetune | 0.73 | No | - | - | - |
| 8 | ruBert-large finetune | 0.68 | No | - | - | - |
| 9 | RuGPT3XL few-shot | 0.67 | No | - | - | - |
| 10 | MT5 Large | 0.57 | No | mT5: A massively multilingual pre-trained text-to-text transformer | 2020-10-22 | Yes |
| 11 | SBERT_Large | 0.36 | No | - | - | - |
| 12 | SBERT_Large_mt_ru_finetuning | 0.35 | No | - | - | - |
| 13 | RuBERT plain | 0.32 | No | - | - | - |
| 14 | Multilingual Bert | 0.29 | No | - | - | - |
| 15 | heuristic majority | 0.26 | No | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | 2021-05-03 | - |
| 16 | Baseline TF-IDF1.1 | 0.26 | No | RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | 2020-10-29 | Yes |
| 17 | Random weighted | 0.25 | No | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | 2021-05-03 | - |
| 18 | majority_class | 0.25 | No | Unreasonable Effectiveness of Rule-Based Heuristics in Solving Russian SuperGLUE Tasks | 2021-05-03 | - |
| 19 | RuGPT3Medium | 0.23 | No | - | - | - |
| 20 | RuBERT conversational | 0.22 | No | - | - | - |
| 21 | RuGPT3Small | 0.21 | No | - | - | - |
| 22 | RuGPT3Large | 0.21 | No | - | - | - |

Entries flagged "SOTA" on the leaderboard: Human Benchmark (2020-10-29) and MT5 Large (2020-10-22).