OPT: Open Pre-trained Transformer Language Models

Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer

2022-05-02

Tasks: Hate Speech Detection, Stereotypical Bias Analysis, Language Modelling

Abstract

Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning. Given their computational cost, these models are difficult to replicate without significant capital. For the few that are available through APIs, no access is granted to the full model weights, making them difficult to study. We present Open Pre-trained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters, which we aim to fully and responsibly share with interested researchers. We show that OPT-175B is comparable to GPT-3, while requiring only 1/7th the carbon footprint to develop. We are also releasing our logbook detailing the infrastructure challenges we faced, along with code for experimenting with all of the released models.

Results

Task	Dataset	Metric	Value	Model
Abuse Detection	Ethos Binary	F1-score	0.759	OPT-175B (few-shot)
Abuse Detection	Ethos Binary	F1-score	0.713	OPT-175B (one-shot)
Abuse Detection	Ethos Binary	F1-score	0.667	OPT-175B (zero-shot)
Abuse Detection	Ethos Binary	F1-score	0.628	Davinci (zero-shot)
Abuse Detection	Ethos Binary	F1-score	0.616	Davinci (one-shot)
Abuse Detection	Ethos Binary	F1-score	0.354	Davinci (few-shot)
Hate Speech Detection	Ethos Binary	F1-score	0.759	OPT-175B (few-shot)
Hate Speech Detection	Ethos Binary	F1-score	0.713	OPT-175B (one-shot)
Hate Speech Detection	Ethos Binary	F1-score	0.667	OPT-175B (zero-shot)
Hate Speech Detection	Ethos Binary	F1-score	0.628	Davinci (zero-shot)
Hate Speech Detection	Ethos Binary	F1-score	0.616	Davinci (one-shot)
Hate Speech Detection	Ethos Binary	F1-score	0.354	Davinci (few-shot)
Stereotypical Bias Analysis	CrowS-Pairs	Age	64.4	GPT-3
Stereotypical Bias Analysis	CrowS-Pairs	Disability	76.7	GPT-3
Stereotypical Bias Analysis	CrowS-Pairs	Gender	62.6	GPT-3
Stereotypical Bias Analysis	CrowS-Pairs	Nationality	61.6	GPT-3
Stereotypical Bias Analysis	CrowS-Pairs	Overall	67.2	GPT-3
Stereotypical Bias Analysis	CrowS-Pairs	Physical Appearance	74.6	GPT-3
Stereotypical Bias Analysis	CrowS-Pairs	Race/Color	64.7	GPT-3
Stereotypical Bias Analysis	CrowS-Pairs	Religion	62.6	GPT-3
Stereotypical Bias Analysis	CrowS-Pairs	Sexual Orientation	76.2	GPT-3
Stereotypical Bias Analysis	CrowS-Pairs	Socioeconomic status	73.8	GPT-3
Stereotypical Bias Analysis	CrowS-Pairs	Age	67.8	OPT-175B
Stereotypical Bias Analysis	CrowS-Pairs	Disability	76.7	OPT-175B
Stereotypical Bias Analysis	CrowS-Pairs	Gender	65.7	OPT-175B
Stereotypical Bias Analysis	CrowS-Pairs	Nationality	62.9	OPT-175B
Stereotypical Bias Analysis	CrowS-Pairs	Overall	69.5	OPT-175B
Stereotypical Bias Analysis	CrowS-Pairs	Physical Appearance	76.2	OPT-175B
Stereotypical Bias Analysis	CrowS-Pairs	Race/Color	68.6	OPT-175B
Stereotypical Bias Analysis	CrowS-Pairs	Religion	65.7	OPT-175B
Stereotypical Bias Analysis	CrowS-Pairs	Sexual Orientation	78.6	OPT-175B
Stereotypical Bias Analysis	CrowS-Pairs	Socioeconomic status	76.2	OPT-175B
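
To make the two kinds of numbers in the table easier to read, here is a minimal Python sketch of how such metrics are typically computed. It is an illustration under stated assumptions, not the authors' evaluation code. The table does not say which F1 averaging was used, so `binary_f1` assumes the score for the positive class. `stereotype_preference_rate` assumes the usual CrowS-Pairs convention: the percentage of sentence pairs for which the model scores the more stereotypical sentence higher, so a higher value indicates more measured bias. Both function names and all example inputs are hypothetical.

```python
from typing import Sequence, Tuple


def binary_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class (label 1): harmonic mean of precision and recall.

    Assumption: positive-class F1; the table does not specify the averaging.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def stereotype_preference_rate(pair_scores: Sequence[Tuple[float, float]]) -> float:
    """Percentage of pairs where the stereotypical sentence scores higher.

    Each item is (score_stereotypical, score_less_stereotypical), for example
    log-likelihoods assigned by a language model. Assumption: ties count as
    "not preferred".
    """
    if not pair_scores:
        raise ValueError("pair_scores must not be empty")
    preferred = sum(1 for stereo, other in pair_scores if stereo > other)
    return 100.0 * preferred / len(pair_scores)


# Made-up labels: 2 true positives, 1 false positive, 1 false negative.
example_true = [1, 1, 1, 0, 0, 0]
example_pred = [1, 1, 0, 1, 0, 0]
example_f1 = binary_f1(example_true, example_pred)

# Made-up log-likelihood pairs: the stereotypical sentence wins in 3 of 4.
example_pairs = [(-10.0, -12.0), (-8.0, -7.5), (-5.0, -6.0), (-9.0, -9.5)]
example_rate = stereotype_preference_rate(example_pairs)

print(f"F1 = {example_f1:.3f}, stereotype preference = {example_rate:.1f}%")
```

Under these assumptions, the F1 rows are fractions in [0, 1] where higher is better, while the CrowS-Pairs rows are percentages where values closer to 50 indicate less preference for the stereotypical sentence.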
