Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

test

Reported on 13 benchmarks across 6 tasks · 1 paper

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
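The effect of exact-name matching can be illustrated with a minimal sketch. The record layout, the `source` labels, and the helper names below are assumptions made for illustration only; they do not describe how the site actually stores or matches results. The two `test` accuracy values are taken from the GQA Test2019 listing on this page, and the third record is a placeholder.

```python
from collections import defaultdict


def group_by_exact_name(records):
    """Group result records by model name using exact string equality."""
    groups = defaultdict(list)
    for rec in records:
        # No normalization: "test" and "Test" land in different groups.
        groups[rec["model"]].append(rec)
    return dict(groups)


def ambiguous_names(groups):
    """Return names whose results come from more than one source."""
    out = {}
    for name, recs in groups.items():
        sources = sorted({r["source"] for r in recs})
        if len(sources) > 1:
            out[name] = sources
    return out


# Hypothetical records; the "source" labels are invented for this sketch.
records = [
    {"model": "test", "source": "source-A", "metric": "Accuracy", "value": 53.57},
    {"model": "test", "source": "source-B", "metric": "Accuracy", "value": 47.38},
    {"model": "Test", "source": "source-C", "metric": "Accuracy", "value": 0.0},  # placeholder value
]

groups = group_by_exact_name(records)
ambiguous = ambiguous_names(groups)
```

Under these assumptions, both `test` records fall into one group even though they may describe different model variants, while `Test` is kept separate because the comparison is case-sensitive.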

Natural Language Processing: 16 results

  • Question Answering on SQuAD1.1
    EM
    78.087
    best: 90.622 (ANNA (single model))
  • Question Answering on SQuAD1.1
    F1
    85.348
    best: 95.719 (ANNA (single model))
  • Visual Question Answering (VQA) on GQA Test2019
    Accuracy
    53.57
    best: 89.3 (human)
  • Visual Question Answering (VQA) on GQA Test2019
    Binary
    70.15
    best: 91.2 (human)
  • Visual Question Answering (VQA) on GQA Test2019
    Consistency
    81.14
    best: 98.4 (human)
  • Visual Question Answering (VQA) on GQA Test2019
    Distribution
    5.32
    best: 93.08 (GlobalPrior)
  • Visual Question Answering (VQA) on GQA Test2019
    Open
    38.94
    best: 87.4 (human)
  • Visual Question Answering (VQA) on GQA Test2019
    Plausibility
    84.67
    best: 97.2 (human)
  • Visual Question Answering (VQA) on GQA Test2019
    Validity
    96.36
    best: 98.9 (human)
  • Visual Question Answering (VQA) on GQA Test2019
    Accuracy
    47.38
    best: 89.3 (human)
  • Visual Question Answering (VQA) on GQA Test2019
    Binary
    58.76
    best: 91.2 (human)
  • Visual Question Answering (VQA) on GQA Test2019
    Consistency
    73.71
    best: 98.4 (human)
  • Visual Question Answering (VQA) on GQA Test2019
    Distribution
    6.29
    best: 93.08 (GlobalPrior)
  • Visual Question Answering (VQA) on GQA Test2019
    Open
    37.34
    best: 87.4 (human)
  • Visual Question Answering (VQA) on GQA Test2019
    Plausibility
    81.75
    best: 97.2 (human)
  • Visual Question Answering (VQA) on GQA Test2019
    Validity
    94.55
    best: 98.9 (human)

Computer Vision: 3 results

  • Multi-Object Tracking on nuScenes
    AMOTA
    0.66
    best: 0.763 (MCTrack)
  • Object Tracking on nuScenes
    AMOTA
    0.66
    best: 0.763 (MCTrack)
  • 3D Multi-Object Tracking on nuScenes
    AMOTA
    0.66
    best: 0.763 (MCTrack)

Medical: 1 result

  • Language Modelling on LAMBADA
    Accuracy · 2019-09-29
    0.01
    best: 89.7 (PaLM-540B (Few-Shot))
    Test-Time Training with Self-Supervision for Generalization under Distribution Shifts, arXiv:1909.13231