LLaMA: Open and Efficient Foundation Language Models

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample

Published: 2023-02-27 (arXiv 2023)

Tasks: Question Answering, Few-Shot Learning, Math Word Problem Solving, Multi-task Language Understanding, Sentence Completion, Stereotypical Bias Analysis, Common Sense Reasoning, Arithmetic Reasoning, Code Generation, Zero-Shot Learning

Abstract

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

Results

Task	Dataset	Metric	Value	Model
Reading Comprehension	RACE	Accuracy (High)	51.6	LLaMA 65B (zero-shot)
Reading Comprehension	RACE	Accuracy (Middle)	67.9	LLaMA 65B (zero-shot)
Reading Comprehension	RACE	Accuracy (High)	48.3	LLaMA 33B (zero-shot)
Reading Comprehension	RACE	Accuracy (Middle)	64.1	LLaMA 33B (zero-shot)
Reading Comprehension	RACE	Accuracy (High)	47.2	LLaMA 13B (zero-shot)
Reading Comprehension	RACE	Accuracy (Middle)	61.6	LLaMA 13B (zero-shot)
Reading Comprehension	RACE	Accuracy (High)	46.9	LLaMA 7B (zero-shot)
Reading Comprehension	RACE	Accuracy (Middle)	61.1	LLaMA 7B (zero-shot)
Few-Shot Learning	MedConceptsQA	Accuracy	25.653	meta-llama/Meta-Llama-3-8B-Instruct
Zero-Shot Learning	MedConceptsQA	Accuracy	25.84	meta-llama/Meta-Llama-3-8B-Instruct
Transfer Learning	MMLU	Average (%)	68.9	LLaMA 65B (fine-tuned)
Transfer Learning	MMLU	Average (%)	63.4	LLaMA 65B (5-shot)
Transfer Learning	MMLU	Average (%)	57.8	LLaMA 33B (5-shot)
Question Answering	SIQA	Accuracy	52.3	LLaMA 65B (zero-shot)
Question Answering	SIQA	Accuracy	50.4	LLaMA 13B (zero-shot)
Question Answering	SIQA	Accuracy	50.4	LLaMA 33B (zero-shot)
Question Answering	SIQA	Accuracy	48.9	LLaMA 7B (zero-shot)
Question Answering	Natural Questions	EM	39.9	LLaMA 65B (few-shot, k=64)
Question Answering	Natural Questions	EM	35	LLaMA 65B (few-shot, k=5)
Question Answering	Natural Questions	EM	31	LLaMA 65B (one-shot)
Question Answering	Natural Questions	EM	24.9	LLaMA 33B (zero-shot)
Question Answering	OBQA	Accuracy	60.2	LLaMA 65B (zero-shot)
Question Answering	OBQA	Accuracy	58.6	LLaMA 33B (zero-shot)
Question Answering	OBQA	Accuracy	57.2	LLaMA 7B (zero-shot)
Question Answering	OBQA	Accuracy	56.4	LLaMA 13B (zero-shot)
Question Answering	TruthfulQA	% info	53	LLaMA 65B
Question Answering	TruthfulQA	% true	57	LLaMA 65B
Question Answering	TruthfulQA	% info	48	LLaMA 33B
Question Answering	TruthfulQA	% true	52	LLaMA 33B
Question Answering	TruthfulQA	% info	41	LLaMA 13B
Question Answering	TruthfulQA	% true	47	LLaMA 13B
Question Answering	TruthfulQA	% info	29	LLaMA 7B
Question Answering	TruthfulQA	% true	33	LLaMA 7B
Question Answering	PIQA	Accuracy	82.8	LLaMA 65B (0-shot)
Question Answering	PIQA	Accuracy	82.3	LLaMA 33B (0-shot)
Question Answering	PIQA	Accuracy	80.1	LLaMA 13B (0-shot)
Question Answering	PIQA	Accuracy	79.8	LLaMA 7B (0-shot)
Question Answering	TimeQuestions	P@1	17.8	Llama3
Question Answering	BoolQ	Accuracy	85.3	LLaMA 65B (0-shot)
Question Answering	BoolQ	Accuracy	83.1	LLaMA 33B (0-shot)
Question Answering	BoolQ	Accuracy	78.1	LLaMA 13B (zero-shot)
Question Answering	BoolQ	Accuracy	76.5	LLaMA 7B (zero-shot)
Question Answering	TriviaQA	EM	73	LLaMA 65B (few-shot, k=64)
Question Answering	TriviaQA	EM	72.6	LLaMA 65B (few-shot, k=5)
Question Answering	TriviaQA	EM	71.6	LLaMA 65B (one-shot)
Question Answering	TriviaQA	EM	68.2	LLaMA 65B (zero-shot)
Question Answering	MATH	Accuracy	20.5	LLaMA 65B (maj1@k)
Question Answering	MATH	Parameters (Billions)	65	LLaMA 65B (maj1@k)
Question Answering	MATH	Accuracy	15.2	LLaMA 33B-maj1@k
Question Answering	MATH	Parameters (Billions)	33	LLaMA 33B-maj1@k
Question Answering	MATH	Accuracy	10.6	LLaMA 65B
Question Answering	MATH	Parameters (Billions)	65	LLaMA 65B
Question Answering	MATH	Accuracy	8.8	LLaMA 13B-maj1@k
Question Answering	MATH	Parameters (Billions)	13	LLaMA 13B-maj1@k
Question Answering	MATH	Accuracy	7.1	LLaMA 33B
Question Answering	MATH	Parameters (Billions)	33	LLaMA 33B
Question Answering	MATH	Accuracy	6.9	LLaMA 7B-maj1@k
Question Answering	MATH	Parameters (Billions)	7	LLaMA 7B-maj1@k
Question Answering	MATH	Accuracy	3.9	LLaMA 13B
Question Answering	MATH	Parameters (Billions)	13	LLaMA 13B
Question Answering	MATH	Accuracy	2.9	LLaMA 7B
Question Answering	MATH	Parameters (Billions)	7	LLaMA 7B
Code Generation	MBPP	Accuracy	37.7	LLaMA 65B (0-shot)
Code Generation	MBPP	Accuracy	30.2	LLaMA 33B (0-shot)
Code Generation	MBPP	Accuracy	22	LLaMA 13B (0-shot)
Code Generation	MBPP	Accuracy	17.7	LLaMA 7B (0-shot)
Common Sense Reasoning	WinoGrande	Accuracy	77	LLaMA 65B (0-shot)
Common Sense Reasoning	WinoGrande	Accuracy	76	LLaMA 33B (0-shot)
Common Sense Reasoning	WinoGrande	Accuracy	73	LLaMA 13B (0-shot)
Common Sense Reasoning	WinoGrande	Accuracy	70.1	LLaMA 7B (0-shot)
Common Sense Reasoning	ARC (Challenge)	Accuracy	57.8	LLaMA 33B (zero-shot)
Common Sense Reasoning	ARC (Challenge)	Accuracy	56	LLaMA 65B (zero-shot)
Common Sense Reasoning	ARC (Challenge)	Accuracy	52.7	LLaMA 13B (zero-shot)
Common Sense Reasoning	ARC (Challenge)	Accuracy	47.6	LLaMA 7B (zero-shot)
Common Sense Reasoning	ARC (Easy)	Accuracy	80	LLaMA 33B (0-shot)
Common Sense Reasoning	ARC (Easy)	Accuracy	78.9	LLaMA 65B (0-shot)
Common Sense Reasoning	ARC (Easy)	Accuracy	74.8	LLaMA 13B (0-shot)
Common Sense Reasoning	ARC (Easy)	Accuracy	72.8	LLaMA 7B (0-shot)
Math Word Problem Solving	MATH	Accuracy	20.5	LLaMA 65B (maj1@k)
Math Word Problem Solving	MATH	Parameters (Billions)	65	LLaMA 65B (maj1@k)
Math Word Problem Solving	MATH	Accuracy	15.2	LLaMA 33B-maj1@k
Math Word Problem Solving	MATH	Parameters (Billions)	33	LLaMA 33B-maj1@k
Math Word Problem Solving	MATH	Accuracy	10.6	LLaMA 65B
Math Word Problem Solving	MATH	Parameters (Billions)	65	LLaMA 65B
Math Word Problem Solving	MATH	Accuracy	8.8	LLaMA 13B-maj1@k
Math Word Problem Solving	MATH	Parameters (Billions)	13	LLaMA 13B-maj1@k
Math Word Problem Solving	MATH	Accuracy	7.1	LLaMA 33B
Math Word Problem Solving	MATH	Parameters (Billions)	33	LLaMA 33B
Math Word Problem Solving	MATH	Accuracy	6.9	LLaMA 7B-maj1@k
Math Word Problem Solving	MATH	Parameters (Billions)	7	LLaMA 7B-maj1@k
Math Word Problem Solving	MATH	Accuracy	3.9	LLaMA 13B
Math Word Problem Solving	MATH	Parameters (Billions)	13	LLaMA 13B
Math Word Problem Solving	MATH	Accuracy	2.9	LLaMA 7B
Math Word Problem Solving	MATH	Parameters (Billions)	7	LLaMA 7B
Meta-Learning	MedConceptsQA	Accuracy	25.653	meta-llama/Meta-Llama-3-8B-Instruct
Mathematical Question Answering	MATH	Accuracy	20.5	LLaMA 65B (maj1@k)
Mathematical Question Answering	MATH	Parameters (Billions)	65	LLaMA 65B (maj1@k)
Mathematical Question Answering	MATH	Accuracy	15.2	LLaMA 33B-maj1@k
Mathematical Question Answering	MATH	Parameters (Billions)	33	LLaMA 33B-maj1@k
Mathematical Question Answering	MATH	Accuracy	10.6	LLaMA 65B
Mathematical Question Answering	MATH	Parameters (Billions)	65	LLaMA 65B
Mathematical Question Answering	MATH	Accuracy	8.8	LLaMA 13B-maj1@k
Mathematical Question Answering	MATH	Parameters (Billions)	13	LLaMA 13B-maj1@k
Mathematical Question Answering	MATH	Accuracy	7.1	LLaMA 33B
Mathematical Question Answering	MATH	Parameters (Billions)	33	LLaMA 33B
Mathematical Question Answering	MATH	Accuracy	6.9	LLaMA 7B-maj1@k
Mathematical Question Answering	MATH	Parameters (Billions)	7	LLaMA 7B-maj1@k
Mathematical Question Answering	MATH	Accuracy	3.9	LLaMA 13B
Mathematical Question Answering	MATH	Parameters (Billions)	13	LLaMA 13B
Mathematical Question Answering	MATH	Accuracy	2.9	LLaMA 7B
Mathematical Question Answering	MATH	Parameters (Billions)	7	LLaMA 7B
Multi-Task Learning	MMLU	Average (%)	68.9	LLaMA 65B (fine-tuned)
Multi-Task Learning	MMLU	Average (%)	63.4	LLaMA 65B (5-shot)
Multi-Task Learning	MMLU	Average (%)	57.8	LLaMA 33B (5-shot)
Mathematical Reasoning	MATH	Accuracy	20.5	LLaMA 65B (maj1@k)
Mathematical Reasoning	MATH	Parameters (Billions)	65	LLaMA 65B (maj1@k)
Mathematical Reasoning	MATH	Accuracy	15.2	LLaMA 33B-maj1@k
Mathematical Reasoning	MATH	Parameters (Billions)	33	LLaMA 33B-maj1@k
Mathematical Reasoning	MATH	Accuracy	10.6	LLaMA 65B
Mathematical Reasoning	MATH	Parameters (Billions)	65	LLaMA 65B
Mathematical Reasoning	MATH	Accuracy	8.8	LLaMA 13B-maj1@k
Mathematical Reasoning	MATH	Parameters (Billions)	13	LLaMA 13B-maj1@k
Mathematical Reasoning	MATH	Accuracy	7.1	LLaMA 33B
Mathematical Reasoning	MATH	Parameters (Billions)	33	LLaMA 33B
Mathematical Reasoning	MATH	Accuracy	6.9	LLaMA 7B-maj1@k
Mathematical Reasoning	MATH	Parameters (Billions)	7	LLaMA 7B-maj1@k
Mathematical Reasoning	MATH	Accuracy	3.9	LLaMA 13B
Mathematical Reasoning	MATH	Parameters (Billions)	13	LLaMA 13B
Mathematical Reasoning	MATH	Accuracy	2.9	LLaMA 7B
Mathematical Reasoning	MATH	Parameters (Billions)	7	LLaMA 7B
Sentence Completion	HellaSwag	Accuracy	84.2	LLaMA 65B (0-shot)
Sentence Completion	HellaSwag	Accuracy	82.8	LLaMA 33B (0-shot)
Sentence Completion	HellaSwag	Accuracy	79.2	LLaMA 13B (0-shot)
Sentence Completion	HellaSwag	Accuracy	76.1	LLaMA 7B (0-shot)
Stereotypical Bias Analysis	CrowS-Pairs	Age	70.1	LLaMA 65B
Stereotypical Bias Analysis	CrowS-Pairs	Disability	66.7	LLaMA 65B
Stereotypical Bias Analysis	CrowS-Pairs	Gender	70.6	LLaMA 65B
Stereotypical Bias Analysis	CrowS-Pairs	Nationality	64.2	LLaMA 65B
Stereotypical Bias Analysis	CrowS-Pairs	Overall	66.6	LLaMA 65B
Stereotypical Bias Analysis	CrowS-Pairs	Physical Appearance	77.8	LLaMA 65B
Stereotypical Bias Analysis	CrowS-Pairs	Race/Color	57	LLaMA 65B
Stereotypical Bias Analysis	CrowS-Pairs	Religion	70.6	LLaMA 65B
Stereotypical Bias Analysis	CrowS-Pairs	Sexual Orientation	81	LLaMA 65B
Stereotypical Bias Analysis	CrowS-Pairs	Socioeconomic status	71.5	LLaMA 65B
Arithmetic Reasoning	GSM8K	Accuracy	69.7	LLaMA 65B-maj1@k
Arithmetic Reasoning	GSM8K	Parameters (Billions)	65	LLaMA 65B-maj1@k
Arithmetic Reasoning	GSM8K	Accuracy	53.1	LLaMA 33B-maj1@k
Arithmetic Reasoning	GSM8K	Parameters (Billions)	33	LLaMA 33B-maj1@k
Arithmetic Reasoning	GSM8K	Accuracy	50.9	LLaMA 65B
Arithmetic Reasoning	GSM8K	Parameters (Billions)	65	LLaMA 65B
Arithmetic Reasoning	GSM8K	Accuracy	35.6	LLaMA 33B
Arithmetic Reasoning	GSM8K	Parameters (Billions)	33	LLaMA 33B
Arithmetic Reasoning	GSM8K	Accuracy	29.3	LLaMA 13B-maj1@k
Arithmetic Reasoning	GSM8K	Parameters (Billions)	13	LLaMA 13B-maj1@k
Arithmetic Reasoning	GSM8K	Accuracy	18.1	LLaMA 7B (maj1@k)
Arithmetic Reasoning	GSM8K	Parameters (Billions)	7	LLaMA 7B (maj1@k)
Arithmetic Reasoning	GSM8K	Accuracy	17.8	LLaMA 13B
Arithmetic Reasoning	GSM8K	Parameters (Billions)	13	LLaMA 13B
Arithmetic Reasoning	GSM8K	Accuracy	11	LLaMA 7B
Arithmetic Reasoning	GSM8K	Parameters (Billions)	7	LLaMA 7B
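
The maj1@k rows for MATH and GSM8K in the table refer to majority voting: k answers are sampled per problem and the most frequent final answer is the one scored. The sketch below illustrates that scoring rule only. The function names and the toy data are hypothetical, and answer extraction and normalization are assumed to have happened already; this is not the evaluation code used for the reported numbers.

```python
from collections import Counter


def maj1_at_k(samples, reference):
    """Return True if the most frequent of the k sampled answers equals the reference.

    `samples` is a list of already-extracted final answers (strings).
    Ties are resolved in favour of the answer that appears first in `samples`.
    """
    voted_answer, _ = Counter(samples).most_common(1)[0]
    return voted_answer == reference


def accuracy(problems):
    """Percentage of problems whose majority-voted answer matches the reference."""
    correct = sum(maj1_at_k(samples, ref) for samples, ref in problems)
    return 100.0 * correct / len(problems)


# Hypothetical toy data: (k sampled final answers, reference answer).
toy_problems = [
    (["18", "18", "20", "18"], "18"),  # majority vote is "18": counted correct
    (["7", "9", "9", "5"], "7"),       # majority vote is "9": counted wrong
]
toy_accuracy = accuracy(toy_problems)
```

With this rule a problem can be scored correct even when some individual samples are wrong, which is consistent with the maj1@k rows scoring higher than the single-sample rows for the same model.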

