Metric: Average F1 (higher is better)
| # | Model | Average F1 | Extra Data | Paper | Date | Code |
|---|---|---|---|---|---|---|
| 1 | Human Benchmark | 0.93 | No | RussianSuperGLUE: A Russian Language Understandi... | 2020-10-29 | Code |
| 2 | Golden Transformer | 0.92 | No | - | - | - |
| 3 | YaLM 1.0B few-shot | 0.86 | No | - | - | - |
| 4 | ruT5-large-finetune | 0.81 | No | - | - | - |
| 5 | ruT5-base-finetune | 0.79 | No | - | - | - |
| 6 | ruBert-base finetune | 0.74 | No | - | - | - |
| 7 | ruRoberta-large finetune | 0.73 | No | - | - | - |
| 8 | ruBert-large finetune | 0.68 | No | - | - | - |
| 9 | RuGPT3XL few-shot | 0.67 | No | - | - | - |
| 10 | MT5 Large | 0.57 | No | mT5: A massively multilingual pre-trained text-t... | 2020-10-22 | Code |
| 11 | SBERT_Large | 0.36 | No | - | - | - |
| 12 | SBERT_Large_mt_ru_finetuning | 0.35 | No | - | - | - |
| 13 | RuBERT plain | 0.32 | No | - | - | - |
| 14 | Multilingual Bert | 0.29 | No | - | - | - |
| 15 | heuristic majority | 0.26 | No | Unreasonable Effectiveness of Rule-Based Heurist... | 2021-05-03 | - |
| 16 | Baseline TF-IDF1.1 | 0.26 | No | RussianSuperGLUE: A Russian Language Understandi... | 2020-10-29 | Code |
| 17 | Random weighted | 0.25 | No | Unreasonable Effectiveness of Rule-Based Heurist... | 2021-05-03 | - |
| 18 | majority_class | 0.25 | No | Unreasonable Effectiveness of Rule-Based Heurist... | 2021-05-03 | - |
| 19 | RuGPT3Medium | 0.23 | No | - | - | - |
| 20 | RuBERT conversational | 0.22 | No | - | - | - |
| 21 | RuGPT3Small | 0.21 | No | - | - | - |
| 22 | RuGPT3Large | 0.21 | No | - | - | - |
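
The page does not say how the "Average F1" metric is averaged. As a minimal sketch, assuming it means the unweighted (macro) mean of per-class F1 scores, the computation could look like the following. The function name `macro_f1` and the toy labels are hypothetical and are not taken from the benchmark.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels in y_true or y_pred.

    Assumption: "Average F1" is macro-averaged; the leaderboard itself
    does not state the averaging scheme.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one example is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy three-class data (hypothetical labels)
gold = ["a", "a", "b", "c"]
mostly_right = ["a", "b", "b", "c"]
always_a = ["a", "a", "a", "a"]

score_mostly_right = macro_f1(gold, mostly_right)
score_always_a = macro_f1(gold, always_a)
```

Under this assumption, a predictor that always outputs one class gets a nonzero F1 for that class only, so its macro average stays low even when its accuracy is moderate. This is one reason constant-prediction baselines tend to sit near the bottom of a macro-F1 ranking.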