Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

OPT: Open Pre-trained Transformer Language Models

Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer

2022-05-02 | Hate Speech Detection | Stereotypical Bias Analysis | Language Modelling

Abstract

Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning. Given their computational cost, these models are difficult to replicate without significant capital. For the few that are available through APIs, no access is granted to the full model weights, making them difficult to study. We present Open Pre-trained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters, which we aim to fully and responsibly share with interested researchers. We show that OPT-175B is comparable to GPT-3, while requiring only 1/7th the carbon footprint to develop. We are also releasing our logbook detailing the infrastructure challenges we faced, along with code for experimenting with all of the released models.

Results

Task                         Dataset       Metric                Value  Model
---------------------------  ------------  --------------------  -----  --------------------
Abuse Detection              Ethos Binary  F1-score              0.759  OPT-175B (few-shot)
Abuse Detection              Ethos Binary  F1-score              0.713  OPT-175B (one-shot)
Abuse Detection              Ethos Binary  F1-score              0.667  OPT-175B (zero-shot)
Abuse Detection              Ethos Binary  F1-score              0.628  Davinci (zero-shot)
Abuse Detection              Ethos Binary  F1-score              0.616  Davinci (one-shot)
Abuse Detection              Ethos Binary  F1-score              0.354  Davinci (few-shot)
Hate Speech Detection        Ethos Binary  F1-score              0.759  OPT-175B (few-shot)
Hate Speech Detection        Ethos Binary  F1-score              0.713  OPT-175B (one-shot)
Hate Speech Detection        Ethos Binary  F1-score              0.667  OPT-175B (zero-shot)
Hate Speech Detection        Ethos Binary  F1-score              0.628  Davinci (zero-shot)
Hate Speech Detection        Ethos Binary  F1-score              0.616  Davinci (one-shot)
Hate Speech Detection        Ethos Binary  F1-score              0.354  Davinci (few-shot)
Stereotypical Bias Analysis  CrowS-Pairs   Age                   64.4   GPT-3
Stereotypical Bias Analysis  CrowS-Pairs   Disability            76.7   GPT-3
Stereotypical Bias Analysis  CrowS-Pairs   Gender                62.6   GPT-3
Stereotypical Bias Analysis  CrowS-Pairs   Nationality           61.6   GPT-3
Stereotypical Bias Analysis  CrowS-Pairs   Overall               67.2   GPT-3
Stereotypical Bias Analysis  CrowS-Pairs   Physical Appearance   74.6   GPT-3
Stereotypical Bias Analysis  CrowS-Pairs   Race/Color            64.7   GPT-3
Stereotypical Bias Analysis  CrowS-Pairs   Religion              62.6   GPT-3
Stereotypical Bias Analysis  CrowS-Pairs   Sexual Orientation    76.2   GPT-3
Stereotypical Bias Analysis  CrowS-Pairs   Socioeconomic status  73.8   GPT-3
Stereotypical Bias Analysis  CrowS-Pairs   Age                   67.8   OPT-175B
Stereotypical Bias Analysis  CrowS-Pairs   Disability            76.7   OPT-175B
Stereotypical Bias Analysis  CrowS-Pairs   Gender                65.7   OPT-175B
Stereotypical Bias Analysis  CrowS-Pairs   Nationality           62.9   OPT-175B
Stereotypical Bias Analysis  CrowS-Pairs   Overall               69.5   OPT-175B
Stereotypical Bias Analysis  CrowS-Pairs   Physical Appearance   76.2   OPT-175B
Stereotypical Bias Analysis  CrowS-Pairs   Race/Color            68.6   OPT-175B
Stereotypical Bias Analysis  CrowS-Pairs   Religion              65.7   OPT-175B
Stereotypical Bias Analysis  CrowS-Pairs   Sexual Orientation    78.6   OPT-175B
Stereotypical Bias Analysis  CrowS-Pairs   Socioeconomic status  76.2   OPT-175B
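
The CrowS-Pairs rows list the same ten categories for GPT-3 and OPT-175B, so the per-category gap can be read off with a little arithmetic. The sketch below is only an illustration: the dictionaries are transcribed by hand from the rows above, the helper name `score_gaps` is hypothetical, and the page itself does not state which direction of the score is preferable, so the code reports the raw difference without interpreting it.

```python
# Scores transcribed from the CrowS-Pairs rows of the results table above.
gpt3 = {
    "Age": 64.4, "Disability": 76.7, "Gender": 62.6, "Nationality": 61.6,
    "Overall": 67.2, "Physical Appearance": 74.6, "Race/Color": 64.7,
    "Religion": 62.6, "Sexual Orientation": 76.2, "Socioeconomic status": 73.8,
}
opt_175b = {
    "Age": 67.8, "Disability": 76.7, "Gender": 65.7, "Nationality": 62.9,
    "Overall": 69.5, "Physical Appearance": 76.2, "Race/Color": 68.6,
    "Religion": 65.7, "Sexual Orientation": 78.6, "Socioeconomic status": 76.2,
}


def score_gaps(a, b):
    """Return {category: a - b}, rounded to one decimal, for shared categories.

    Hypothetical helper for illustration; not part of any released OPT code.
    """
    return {k: round(a[k] - b[k], 1) for k in a if k in b}


gaps = score_gaps(opt_175b, gpt3)

# Print categories from the largest to the smallest OPT-175B minus GPT-3 gap.
for category, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category:<22}{gap:+.1f}")
```

With these numbers the OPT-175B score is at least as high as the GPT-3 score in every listed category; the largest gap is in Race/Color and the two models tie on Disability.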

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
Making Language Model a Hierarchical Classifier and Generator (2025-07-17)
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning (2025-07-17)
The Generative Energy Arena (GEA): Incorporating Energy Awareness in Large Language Model (LLM) Human Evaluations (2025-07-17)
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities (2025-07-17)
Assay2Mol: large language model-based drug design using BioAssay context (2025-07-16)
Describe Anything Model for Visual Question Answering on Text-rich Images (2025-07-16)
InstructFLIP: Exploring Unified Vision-Language Model for Face Anti-spoofing (2025-07-16)