Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

OPT-175B

Reported on 16 benchmarks across 4 tasks · 2 papers · 10 SOTA

Note: results are matched by exact model name. Different papers may use the same name for different model variants.
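
The exact-name matching described in the note can be illustrated with a minimal sketch. The record layout and the `group_by_exact_name` helper are assumptions made for illustration, not the site's actual implementation; the first two rows reuse values listed on this page, and the third row is a made-up placeholder showing a near-identical name that would not be matched.

```python
from collections import defaultdict

# Hypothetical records: (model name as reported, benchmark, metric, value).
records = [
    ("OPT-175B", "PIQA", "Accuracy", 81.07),      # value from this page
    ("OPT-175B", "LAMBADA", "Accuracy", 75.59),   # value from this page
    ("OPT 175B", "ExampleBench", "Accuracy", 0.0),  # placeholder row, not real data
]

def group_by_exact_name(rows):
    """Group results by the model name string, compared character for character."""
    grouped = defaultdict(list)
    for name, benchmark, metric, value in rows:
        grouped[name].append((benchmark, metric, value))
    return dict(grouped)

by_model = group_by_exact_name(records)

# "OPT-175B" and "OPT 175B" land in separate groups because the strings differ.
for name, results in by_model.items():
    print(name, len(results))
```

Under this kind of matching, a spelling difference splits one model into two entries, while two different variants that share a name are merged into one, which is the caveat the note raises.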

Natural Language Processing · 14 results

  • Stereotypical Bias Analysis on CrowS-Pairs
    Age · 2022-05-02
    67.8
    best: 70.1 (LLaMA 65B)
    SOTA
    OPT: Open Pre-trained Transformer Language Models · arXiv:2205.01068
  • Stereotypical Bias Analysis on CrowS-Pairs
    Gender · 2022-05-02
    65.7
    best: 70.6 (LLaMA 65B)
    SOTA
    OPT: Open Pre-trained Transformer Language Models · arXiv:2205.01068
  • Stereotypical Bias Analysis on CrowS-Pairs
    Nationality · 2022-05-02
    62.9
    best: 64.2 (LLaMA 65B)
    SOTA
    OPT: Open Pre-trained Transformer Language Models · arXiv:2205.01068
  • Stereotypical Bias Analysis on CrowS-Pairs
    Overall · 2022-05-02
    69.5
    SOTA
    OPT: Open Pre-trained Transformer Language Models · arXiv:2205.01068
  • Stereotypical Bias Analysis on CrowS-Pairs
    Physical Appearance · 2022-05-02
    76.2
    best: 77.8 (LLaMA 65B)
    SOTA
    OPT: Open Pre-trained Transformer Language Models · arXiv:2205.01068
  • Stereotypical Bias Analysis on CrowS-Pairs
    Race/Color · 2022-05-02
    68.6
    SOTA
    OPT: Open Pre-trained Transformer Language Models · arXiv:2205.01068
  • Stereotypical Bias Analysis on CrowS-Pairs
    Religion · 2022-05-02
    65.7
    best: 70.6 (LLaMA 65B)
    SOTA
    OPT: Open Pre-trained Transformer Language Models · arXiv:2205.01068
  • Stereotypical Bias Analysis on CrowS-Pairs
    Sexual Orientation · 2022-05-02
    78.6
    best: 81 (LLaMA 65B)
    SOTA
    OPT: Open Pre-trained Transformer Language Models · arXiv:2205.01068
  • Stereotypical Bias Analysis on CrowS-Pairs
    Socioeconomic status · 2022-05-02
    76.2
    SOTA
    OPT: Open Pre-trained Transformer Language Models · arXiv:2205.01068
  • Question Answering on PIQA
    Accuracy · 2023-01-02
    81.07
    best: 90.1 (Unicorn 11B (fine-tuned))
    SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot · arXiv:2301.00774
  • Question Answering on StoryCloze
    Accuracy · 2023-01-02
    79.82
    best: 96.3 (BLOOMZ)
    SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot · arXiv:2301.00774
  • Common Sense Reasoning on ARC (Challenge)
    Accuracy · 2023-01-02
    43.94
    best: 96.4 (GPT-4 (few-shot, k=25))
    SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot · arXiv:2301.00774
  • Common Sense Reasoning on ARC (Easy)
    Accuracy · 2023-01-02
    71.04
    best: 95.2 (ST-MoE-32B 269B (fine-tuned))
    SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot · arXiv:2301.00774
  • Stereotypical Bias Analysis on CrowS-Pairs
    Disability · 2022-05-02
    76.7
    OPT: Open Pre-trained Transformer Language Models · arXiv:2205.01068

Medical · 2 results

  • Language Modelling on WikiText-2
    Test perplexity · uses extra data · 2023-01-02
    8.34
    best: 8.21 (SparseGPT (175B, 50% Sparsity))
    SOTA
    SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot · arXiv:2301.00774
  • Language Modelling on LAMBADA
    Accuracy · 2023-01-02
    75.59
    best: 89.7 (PaLM-540B (Few-Shot))
    SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot · arXiv:2301.00774