Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

aa_evalai

Reported on 10 benchmarks across 2 tasks

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
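
To illustrate why exact-name matching can merge different variants and split near-identical names, here is a minimal Python sketch. It is not the site's actual implementation: the `records` list, the function `group_by_exact_name`, and all names and scores in it are hypothetical, made up for illustration only.

```python
from collections import defaultdict

# Hypothetical records: (reported model name, source paper, metric, value).
# None of these rows come from the listing; they only illustrate the matching rule.
records = [
    ("example_model", "paper A", "Accuracy", 90.0),
    ("example_model", "paper B", "Accuracy", 85.0),  # different variant, same name
    ("Example_Model", "paper C", "Accuracy", 80.0),  # differs only in letter case
]

def group_by_exact_name(rows):
    """Group results by the exact, case-sensitive model name string."""
    grouped = defaultdict(list)
    for name, paper, metric, value in rows:
        grouped[name].append((paper, metric, value))
    return dict(grouped)

grouped = group_by_exact_name(records)
for name, results in grouped.items():
    print(name, len(results))
```

Under this assumption, the two `example_model` rows land on one page even though they come from different papers, while `Example_Model` gets a separate page.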

Natural Language Processing: 10 results

| Task | Benchmark | Metric | aa_evalai | Best (model) |
| --- | --- | --- | --- | --- |
| Fact Verification | KILT: FEVER | Accuracy | 88.45 | 89.55 (Re2G) |
| Fact Verification | KILT: FEVER | KILT-AC | 0 | 78.53 (Re2G) |
| Fact Verification | KILT: FEVER | R-Prec | 0 | 88.92 (Re2G) |
| Fact Verification | KILT: FEVER | Recall@5 | 0 | 92.52 (Re2G) |
| Open-Domain Dialog | KILT: Wizard of Wikipedia | F1 | 17.3 | 19.19 (Hindsight) |
| Open-Domain Dialog | KILT: Wizard of Wikipedia | KILT-F1 | 0 | 13.39 (Hindsight) |
| Open-Domain Dialog | KILT: Wizard of Wikipedia | KILT-RL | 0 | 11.92 (Hindsight) |
| Open-Domain Dialog | KILT: Wizard of Wikipedia | R-Prec | 0 | 64.79 (chriskuei) |
| Open-Domain Dialog | KILT: Wizard of Wikipedia | ROUGE-L | 15.93 | 17.06 (Hindsight) |
| Open-Domain Dialog | KILT: Wizard of Wikipedia | Recall@5 | 0 | 82.15 (chriskuei) |