Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

FLAN 137B (zero-shot)

Reported on 17 benchmarks across 6 tasks · 1 paper · 1 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
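To illustrate what exact-name matching implies, here is a minimal sketch, assuming results are stored as simple (model name, benchmark, score) records. The record layout and the function name `group_by_exact_name` are hypothetical and are not taken from the site's code; the lowercase entry is an invented record included only to show that near-identical names are not merged.

```python
from collections import defaultdict

# Hypothetical records: (model name, benchmark, score).
# The first three scores appear on this page; the last record is invented
# for illustration and differs from the first name only in letter case.
results = [
    ("FLAN 137B (zero-shot)", "OBQA", 78.4),
    ("FLAN 137B (zero-shot)", "COPA", 91.0),
    ("FLAN 137B (few-shot, k=9)", "WMT2014 French-English", 37.9),
    ("flan 137b (zero-shot)", "Example benchmark", 0.0),
]


def group_by_exact_name(records):
    """Group results by the model name string, compared byte for byte."""
    grouped = defaultdict(list)
    for name, benchmark, score in records:
        # No normalization: case, spacing, and suffixes all matter.
        grouped[name].append((benchmark, score))
    return dict(grouped)


pages = group_by_exact_name(results)
for name, rows in pages.items():
    print(name, "->", len(rows), "result(s)")
```

Under this assumption, two papers that report different variants under the same name would land on one page, while the same model spelled differently would be split across pages.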

Natural Language Processing · 17 results

  • Question Answering on OBQA
    Accuracy · 2021-09-03
    78.4
    SOTA
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Machine Translation on WMT2016 Romanian-English
    BLEU score · 2021-09-03
    37.3
    best: 40.3 (fast-noisy-channel-modeling)
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Machine Translation on WMT2014 French-English
    BLEU score · 2021-09-03
    35.9
    best: 37.9 (FLAN 137B (few-shot, k=9))
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Machine Translation on WMT2016 English-German
    BLEU score · 2021-09-03
    27
    best: 40.68 (MADL)
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Machine Translation on WMT2016 German-English
    BLEU score · 2021-09-03
    38.9
    best: 40.7 (FLAN 137B (few-shot, k=11))
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Machine Translation on WMT2016 English-Romanian
    BLEU score · 2021-09-03
    18.9
    best: 34.7 (DeLighT)
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Machine Translation on WMT2014 English-French
    BLEU score · 2021-09-03
    33.9
    best: 46.4 (Transformer+BT (ADMIN init))
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Question Answering on COPA
    Accuracy · 2021-09-03
    91
    best: 100 (PaLM 540B (finetuned))
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Question Answering on MultiRC
    F1 · 2021-09-03
    77.5
    best: 90.1 (PaLM 540B (finetuned))
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Question Answering on StoryCloze
    Accuracy · 2021-09-03
    93.4
    best: 96.3 (BLOOMZ)
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Question Answering on NaturalQA
    EM · 2021-09-03
    20.7
    best: 41.5 (DPR)
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Question Answering on TriviaQA
    EM · 2021-09-03
    56.7
    best: 87.5 (Claude 2 (few-shot, k=5))
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Common Sense Reasoning on ARC (Challenge)
    Accuracy · 2021-09-03
    63.1
    best: 96.4 (GPT-4 (few-shot, k=25))
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Common Sense Reasoning on ReCoRD
    EM · 2021-09-03
    72.5
    best: 95.9 (Turing NLR v5 XXL 5.4B (fine-tuned))
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Natural Language Inference on WNLI
    Accuracy · 2021-09-03
    74.6
    best: 95.9 (Turing NLR v5 XXL 5.4B (fine-tuned))
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Sentiment Analysis on IMDb
    Accuracy · uses extra data · 2021-09-03
    94.3
    best: 96.68 (RoBERTa-large with LlamBERT)
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
  • Coreference Resolution on Winograd Schema Challenge
    Accuracy · 2021-09-03
    80.8
    best: 100 (PaLM 540B (fine-tuned))
    Finetuned Language Models Are Zero-Shot Learners · arXiv:2109.01652
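
As a rough way to read the "best" lines above, the following sketch computes how far this model's score sits below the listed best result on a few benchmarks. The tuple layout and the helper name `gap_to_best` are assumptions for illustration; the numbers are copied from the entries above, and since metrics differ per benchmark (BLEU, EM, accuracy) the gaps are not comparable across rows.

```python
# (benchmark, metric, this model's score, best listed score), copied from the list above.
rows = [
    ("WMT2016 Romanian-English", "BLEU", 37.3, 40.3),
    ("WMT2016 English-German", "BLEU", 27.0, 40.68),
    ("WMT2014 English-French", "BLEU", 33.9, 46.4),
    ("TriviaQA", "EM", 56.7, 87.5),
    ("StoryCloze", "Accuracy", 93.4, 96.3),
]


def gap_to_best(score, best):
    """Absolute gap in metric points, rounded to avoid float noise."""
    return round(best - score, 2)


gaps = {name: gap_to_best(score, best) for name, _metric, score, best in rows}

# Print benchmarks from the smallest gap to the largest.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name}: {gap} points below best")
```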