Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Golden Transformer

Reported on 12 benchmarks across 5 tasks

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
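The exact-name matching described in the note can be sketched as follows. This is only an illustration, not the site's actual implementation: the `Result` record layout and the `results_for_model` helper are hypothetical, and the two rows are copied from this page's listing.

```python
from typing import NamedTuple


class Result(NamedTuple):
    # Hypothetical record layout, assumed for illustration only.
    model: str
    dataset: str
    metric: str
    score: float


# Two rows copied from this page's listing.
RESULTS = [
    Result("Golden Transformer", "DaNetQA", "Accuracy", 0.917),
    Result("Golden Transformer", "TERRa", "Accuracy", 0.871),
]


def results_for_model(results: list[Result], name: str) -> list[Result]:
    """Return rows whose model name equals `name` exactly.

    The comparison is case- and whitespace-sensitive, so a variant
    spelled differently is not matched, while two different models
    that share one name are returned together.
    """
    return [r for r in results if r.model == name]


exact = results_for_model(RESULTS, "Golden Transformer")
lowercase = results_for_model(RESULTS, "golden transformer")
print(len(exact), len(lowercase))
```

Under exact matching, a query spelled with different capitalization returns nothing, and rows from unrelated papers that reuse the same name cannot be told apart without extra metadata.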

Natural Language Processing (12 results)

Reading Comprehension on MuSeRC: Average F1 0.941
Reading Comprehension on MuSeRC: EM 0.819
Question Answering on DaNetQA: Accuracy 0.917
Common Sense Reasoning on RWSD: Accuracy 0.545 (best: 0.84, Human Benchmark)
Common Sense Reasoning on PARus: Accuracy 0.908 (best: 0.982, Human Benchmark)
Common Sense Reasoning on RuCoS: Average F1 0.92 (best: 0.93, Human Benchmark)
Common Sense Reasoning on RuCoS: EM 0.924
Word Sense Disambiguation on RUSSE: Accuracy 0.587 (best: 0.805, Human Benchmark)
Natural Language Inference on RCB: Accuracy 0.546 (best: 0.702, Human Benchmark)
Natural Language Inference on RCB: Average F1 0.406 (best: 0.68, Human Benchmark)
Natural Language Inference on LiDiRus: MCC 0 (best: 0.626, Human Benchmark)
Natural Language Inference on TERRa: Accuracy 0.871 (best: 0.92, Human Benchmark)
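The distance between each reported score and the listed best entry can be computed directly from the numbers above. The sketch below is a convenience calculation, not something the page itself provides: the `ROWS` layout and the `gap_to_best` helper are hypothetical, and only results that show a "best" value are included.

```python
# (dataset, metric, model score, listed best score) copied from the results above.
# Results without a "best" value on the page are omitted.
ROWS = [
    ("RWSD", "Accuracy", 0.545, 0.84),
    ("PARus", "Accuracy", 0.908, 0.982),
    ("RuCoS", "Average F1", 0.92, 0.93),
    ("RUSSE", "Accuracy", 0.587, 0.805),
    ("RCB", "Accuracy", 0.546, 0.702),
    ("RCB", "Average F1", 0.406, 0.68),
    ("LiDiRus", "MCC", 0.0, 0.626),
    ("TERRa", "Accuracy", 0.871, 0.92),
]


def gap_to_best(score: float, best: float) -> float:
    """Absolute difference between the best listed score and the model's score."""
    return round(best - score, 3)


# Sort from the largest gap to the smallest.
gaps = sorted(
    ((gap_to_best(score, best), dataset, metric) for dataset, metric, score, best in ROWS),
    reverse=True,
)

for gap, dataset, metric in gaps:
    print(f"{dataset:8s} {metric:11s} gap = {gap:.3f}")
```

On these rows the largest gap is on LiDiRus (0.626) and the smallest is on RuCoS Average F1 (0.010). The gaps are in each metric's own units, so comparing them across different metrics is only a rough guide.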