Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

aa_evalai

Reported on 10 benchmarks across 2 tasks

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
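As a rough illustration of the note above, the following sketch groups result rows by plain string equality on the model name. It is not the site's actual code: the record layout, the `group_by_exact_name` helper, and the differently cased `AA_EvalAI` row are all hypothetical, included only to show that exact matching treats any spelling variation as a separate model.

```python
from collections import defaultdict

# Hypothetical records: (model name, benchmark, metric, value).
# The first two rows reuse values listed on this page; the third row is
# invented purely to show how a differently spelled name is treated.
records = [
    ("aa_evalai", "KILT: FEVER", "Accuracy", 88.45),
    ("aa_evalai", "KILT: Wizard of Wikipedia", "F1", 17.3),
    ("AA_EvalAI", "KILT: FEVER", "Accuracy", 0.0),  # invented example row
]


def group_by_exact_name(rows):
    """Group rows by model name using exact, case-sensitive string equality."""
    groups = defaultdict(list)
    for name, benchmark, metric, value in rows:
        groups[name].append((benchmark, metric, value))
    return dict(groups)


groups = group_by_exact_name(records)
for name, rows in groups.items():
    print(name, len(rows))
```

Under this assumption, two papers that reuse the same name are merged into one entry, while a single model reported under two spellings is split into two.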

Natural Language Processing (10 results)

  • Fact Verification on KILT: FEVER
    Accuracy
    88.45
    best: 89.55 (Re2G)
  • Fact Verification on KILT: FEVER
    KILT-AC
    0
    best: 78.53 (Re2G)
  • Fact Verification on KILT: FEVER
    R-Prec
    0
    best: 88.92 (Re2G)
  • Fact Verification on KILT: FEVER
    Recall@5
    0
    best: 92.52 (Re2G)
  • Open-Domain Dialog on KILT: Wizard of Wikipedia
    F1
    17.3
    best: 19.19 (Hindsight)
  • Open-Domain Dialog on KILT: Wizard of Wikipedia
    KILT-F1
    0
    best: 13.39 (Hindsight)
  • Open-Domain Dialog on KILT: Wizard of Wikipedia
    KILT-RL
    0
    best: 11.92 (Hindsight)
  • Open-Domain Dialog on KILT: Wizard of Wikipedia
    R-Prec
    0
    best: 64.79 (chriskuei)
  • Open-Domain Dialog on KILT: Wizard of Wikipedia
    ROUGE-L
    15.93
    best: 17.06 (Hindsight)
  • Open-Domain Dialog on KILT: Wizard of Wikipedia
    Recall@5
    0
    best: 82.15 (chriskuei)