
Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.


Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

RuGPT3XL few-shot

Reported on 12 benchmarks across 5 tasks

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Natural Language Processing: 12 results

Task                        Dataset  Metric      Score  Best reported (model)
Reading Comprehension       MuSeRC   Average F1  0.74   0.941 (Golden Transformer)
Reading Comprehension       MuSeRC   EM          0.546  0.819 (Golden Transformer)
Question Answering          DaNetQA  Accuracy    0.59   0.917 (Golden Transformer)
Common Sense Reasoning      RWSD     Accuracy    0.649  0.84 (Human Benchmark)
Common Sense Reasoning      PARus    Accuracy    0.676  0.982 (Human Benchmark)
Common Sense Reasoning      RuCoS    Average F1  0.67   0.93 (Human Benchmark)
Common Sense Reasoning      RuCoS    EM          0.665  0.924 (Golden Transformer)
Word Sense Disambiguation   RUSSE    Accuracy    0.565  0.805 (Human Benchmark)
Natural Language Inference  RCB      Accuracy    0.418  0.702 (Human Benchmark)
Natural Language Inference  RCB      Average F1  0.302  0.68 (Human Benchmark)
Natural Language Inference  LiDiRus  MCC         0.096  0.626 (Human Benchmark)
Natural Language Inference  TERRa    Accuracy    0.573  0.92 (Human Benchmark)
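
Each result above pairs the model's score with the best reported value, so the distance to the top entry is simple arithmetic. The following is a minimal Python sketch of that comparison, not part of the site itself: the names `RESULTS`, `gap_to_best`, and `summarize` are hypothetical, and the numbers are copied from the listing above. Note that MCC is on a different scale from the other metrics, so gaps are best compared per metric.

```python
# Rows copied from the listing above: (task, dataset, metric, score, best).
# The structure and names here are illustrative assumptions.
RESULTS = [
    ("Reading Comprehension", "MuSeRC", "Average F1", 0.74, 0.941),
    ("Reading Comprehension", "MuSeRC", "EM", 0.546, 0.819),
    ("Question Answering", "DaNetQA", "Accuracy", 0.59, 0.917),
    ("Common Sense Reasoning", "RWSD", "Accuracy", 0.649, 0.84),
    ("Common Sense Reasoning", "PARus", "Accuracy", 0.676, 0.982),
    ("Common Sense Reasoning", "RuCoS", "Average F1", 0.67, 0.93),
    ("Common Sense Reasoning", "RuCoS", "EM", 0.665, 0.924),
    ("Word Sense Disambiguation", "RUSSE", "Accuracy", 0.565, 0.805),
    ("Natural Language Inference", "RCB", "Accuracy", 0.418, 0.702),
    ("Natural Language Inference", "RCB", "Average F1", 0.302, 0.68),
    ("Natural Language Inference", "LiDiRus", "MCC", 0.096, 0.626),
    ("Natural Language Inference", "TERRa", "Accuracy", 0.573, 0.92),
]


def gap_to_best(score, best):
    """Return (absolute gap to the best value, score as a share of the best)."""
    return best - score, score / best


def summarize(results):
    """Return (dataset, metric, gap, share) per row, rounded to 3 decimals."""
    rows = []
    for _task, dataset, metric, score, best in results:
        gap, share = gap_to_best(score, best)
        rows.append((dataset, metric, round(gap, 3), round(share, 3)))
    return rows


if __name__ == "__main__":
    for dataset, metric, gap, share in summarize(RESULTS):
        print(f"{dataset:8s} {metric:10s} gap={gap:.3f} share={share:.1%}")
```

Sorting the output of `summarize` by the gap column gives a quick view of where the model trails the top entry most.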