Ross Taylor, Marcin Kardas, Guillem Cucurull, Thomas Scialom, Anthony Hartshorn, Elvis Saravia, Andrew Poulton, Viktor Kerkez, Robert Stojnic
Information overload is a major obstacle to scientific progress. The explosive growth in scientific literature and data has made it ever harder to discover useful insights in a large mass of information. Today scientific knowledge is accessed through search engines, but search engines alone cannot organize scientific knowledge. In this paper we introduce Galactica: a large language model that can store, combine and reason about scientific knowledge. We train on a large scientific corpus of papers, reference material, knowledge bases and many other sources. We outperform existing models on a range of scientific tasks. On technical knowledge probes such as LaTeX equations, Galactica outperforms the latest GPT-3, scoring 68.2% versus 49.0%. Galactica also performs well on reasoning, outperforming Chinchilla on mathematical MMLU (41.3% versus 35.7%) and PaLM 540B on MATH (20.4% versus 8.8%). It also sets a new state of the art on downstream tasks such as PubMedQA and MedMCQA dev, with 77.6% and 52.9% respectively. Despite not being trained on a general corpus, Galactica outperforms BLOOM and OPT-175B on BIG-bench. We believe these results demonstrate the potential for language models as a new interface for science. We open source the model for the benefit of the scientific community.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Transfer Learning | MMLU | Average (%) | 52.6 | GAL 120B (zero-shot) |
| Question Answering | PubMedQA | Accuracy | 77.6 | GAL 120B (zero-shot) |
| Question Answering | PubMedQA | Accuracy | 73.6 | BLOOM (zero-shot) |
| Question Answering | PubMedQA | Accuracy | 70.2 | OPT (zero-shot) |
| Question Answering | MedQA | Accuracy | 44.4 | GAL 120B (zero-shot) |
| Question Answering | MedQA | Accuracy | 23.3 | BLOOM (few-shot, k=5) |
| Question Answering | MedQA | Accuracy | 22.8 | OPT (few-shot, k=5) |
| Question Answering | BioASQ | Accuracy | 94.3 | GAL 120B (zero-shot) |
| Question Answering | BioASQ | Accuracy | 91.4 | BLOOM (zero-shot) |
| Question Answering | BioASQ | Accuracy | 81.4 | OPT (zero-shot) |
| Question Answering | TruthfulQA | MC1 | 0.26 | GAL 120B |
| Question Answering | TruthfulQA | MC1 | 0.24 | GAL 30B |
| Question Answering | TruthfulQA | MC1 | 0.21 | OPT 175B |
| Question Answering | TruthfulQA | MC1 | 0.19 | GAL 125M |
| Question Answering | TruthfulQA | MC1 | 0.19 | GAL 1.3B |
| Question Answering | TruthfulQA | MC1 | 0.19 | GAL 6.7B |
| Question Answering | MMLU (Econometrics) | Accuracy | 43 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (Econometrics) | Accuracy | 42.1 | GAL 120B (zero-shot) |
| Question Answering | MMLU (Econometrics) | Accuracy | 38.6 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (Econometrics) | Accuracy | 23.7 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (Econometrics) | Accuracy | 21 | OPT (few-shot, k=5) |
| Question Answering | MMLU (College Biology) | Accuracy | 79.9 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (College Biology) | Accuracy | 70.8 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (College Biology) | Accuracy | 68.8 | GAL 120B (zero-shot) |
| Question Answering | MMLU (College Biology) | Accuracy | 30.6 | OPT (few-shot, k=5) |
| Question Answering | MMLU (College Biology) | Accuracy | 28.5 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (Machine Learning) | Accuracy | 41.1 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (Machine Learning) | Accuracy | 38.4 | GAL 120B (zero-shot) |
| Question Answering | MMLU (Machine Learning) | Accuracy | 28.6 | OPT (few-shot, k=5) |
| Question Answering | MMLU (Machine Learning) | Accuracy | 25 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (High School Physics) | Accuracy | 36.4 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (High School Physics) | Accuracy | 33.8 | GAL 120B (zero-shot) |
| Question Answering | MMLU (High School Physics) | Accuracy | 29.8 | OPT (few-shot, k=5) |
| Question Answering | MMLU (High School Physics) | Accuracy | 25.2 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (Medical Genetics) | Accuracy | 70 | GAL 30B (zero-shot) |
| Question Answering | MMLU (Medical Genetics) | Accuracy | 69 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (Medical Genetics) | Accuracy | 68 | GAL 120B (zero-shot) |
| Question Answering | MMLU (Medical Genetics) | Accuracy | 36 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (Medical Genetics) | Accuracy | 35 | OPT (few-shot, k=5) |
| Question Answering | MMLU (High School Computer Science) | Accuracy | 70 | GAL 120B (zero-shot) |
| Question Answering | MMLU (High School Computer Science) | Accuracy | 58 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (High School Computer Science) | Accuracy | 54 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (High School Computer Science) | Accuracy | 30 | OPT (few-shot, k=5) |
| Question Answering | MMLU (High School Computer Science) | Accuracy | 25 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (College Chemistry) | Accuracy | 51 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (College Chemistry) | Accuracy | 46 | GAL 120B (zero-shot) |
| Question Answering | MMLU (College Chemistry) | Accuracy | 45 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (College Chemistry) | Accuracy | 30 | OPT (few-shot, k=5) |
| Question Answering | MMLU (College Chemistry) | Accuracy | 19 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (College Computer Science) | Accuracy | 51 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (College Computer Science) | Accuracy | 49 | GAL 120B (zero-shot) |
| Question Answering | MMLU (College Computer Science) | Accuracy | 17 | OPT (few-shot, k=5) |
| Question Answering | MMLU (College Computer Science) | Accuracy | 6 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (Astronomy) | Accuracy | 73 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (Astronomy) | Accuracy | 65.8 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (Astronomy) | Accuracy | 65.1 | GAL 120B (zero-shot) |
| Question Answering | MMLU (Astronomy) | Accuracy | 25.7 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (Astronomy) | Accuracy | 23 | OPT (few-shot, k=5) |
| Question Answering | MMLU (Electrical Engineering) | Accuracy | 62.8 | GAL 120B (zero-shot) |
| Question Answering | MMLU (Electrical Engineering) | Accuracy | 62.1 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (Electrical Engineering) | Accuracy | 60 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (Electrical Engineering) | Accuracy | 36.6 | OPT (few-shot, k=5) |
| Question Answering | MMLU (Electrical Engineering) | Accuracy | 32.4 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (Formal Logic) | Accuracy | 35.7 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (Formal Logic) | Accuracy | 33.3 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (Formal Logic) | Accuracy | 32.5 | GAL 120B (zero-shot) |
| Question Answering | MMLU (Formal Logic) | Accuracy | 29.4 | OPT (few-shot, k=5) |
| Question Answering | MMLU (Formal Logic) | Accuracy | 26.2 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (High School Biology) | Accuracy | 80.3 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (High School Biology) | Accuracy | 71.3 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (High School Biology) | Accuracy | 69.4 | GAL 120B (zero-shot) |
| Question Answering | MMLU (High School Biology) | Accuracy | 29.4 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (High School Biology) | Accuracy | 27.7 | OPT (few-shot, k=5) |
| Question Answering | MMLU (High School Mathematics) | Accuracy | 32.6 | GAL 120B (zero-shot) |
| Question Answering | MMLU (High School Mathematics) | Accuracy | 31.9 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (High School Mathematics) | Accuracy | 27 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (High School Mathematics) | Accuracy | 24.4 | OPT (few-shot, k=5) |
| Question Answering | MMLU (High School Mathematics) | Accuracy | 23.7 | Gopher (few-shot, k=5) |
| Question Answering | MedMCQA | Dev Set (Acc-%) | 52.9 | GAL 120B (zero-shot) |
| Question Answering | MedMCQA | Dev Set (Acc-%) | 32.5 | BLOOM (few-shot, k=5) |
| Question Answering | MedMCQA | Dev Set (Acc-%) | 29.6 | OPT (few-shot, k=5) |
| Question Answering | MMLU (High School Chemistry) | Accuracy | 58.1 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (High School Chemistry) | Accuracy | 47.8 | GAL 120B (zero-shot) |
| Question Answering | MMLU (High School Chemistry) | Accuracy | 23.2 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (High School Chemistry) | Accuracy | 21.7 | OPT (few-shot, k=5) |
| Question Answering | MMLU (Elementary Mathematics) | Accuracy | 41.5 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (Elementary Mathematics) | Accuracy | 38.1 | GAL 120B (zero-shot) |
| Question Answering | MMLU (Elementary Mathematics) | Accuracy | 33.6 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (Elementary Mathematics) | Accuracy | 27.6 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (Elementary Mathematics) | Accuracy | 25.7 | OPT (few-shot, k=5) |
| Question Answering | MMLU (Abstract Algebra) | Accuracy | 33.3 | GAL 30B (zero-shot) |
| Question Answering | MMLU (Abstract Algebra) | Accuracy | 31 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (Abstract Algebra) | Accuracy | 27 | GAL 120B (zero-shot) |
| Question Answering | MMLU (Abstract Algebra) | Accuracy | 25 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (Abstract Algebra) | Accuracy | 21 | OPT (few-shot, k=5) |
| Question Answering | MMLU (High School Statistics) | Accuracy | 58.8 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (High School Statistics) | Accuracy | 50 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (High School Statistics) | Accuracy | 43.5 | OPT (few-shot, k=5) |
| Question Answering | MMLU (High School Statistics) | Accuracy | 41.2 | GAL 120B (zero-shot) |
| Question Answering | MMLU (High School Statistics) | Accuracy | 19.4 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (College Physics) | Accuracy | 46.1 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (College Physics) | Accuracy | 42.2 | GAL 120B (zero-shot) |
| Question Answering | MMLU (College Physics) | Accuracy | 34.3 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (College Physics) | Accuracy | 21.6 | OPT (few-shot, k=5) |
| Question Answering | MMLU (College Physics) | Accuracy | 18.6 | BLOOM (few-shot, k=5) |
| Question Answering | MMLU (College Mathematics) | Accuracy | 43 | GAL 120B (zero-shot) |
| Question Answering | MMLU (College Mathematics) | Accuracy | 37 | Gopher (few-shot, k=5) |
| Question Answering | MMLU (College Mathematics) | Accuracy | 33 | OPT (few-shot, k=5) |
| Question Answering | MMLU (College Mathematics) | Accuracy | 32 | Chinchilla (few-shot, k=5) |
| Question Answering | MMLU (College Mathematics) | Accuracy | 25 | BLOOM (few-shot, k=5) |
| Question Answering | MATH | Accuracy | 33.6 | Minerva 540B (5-shot) mCoT |
| Question Answering | MATH | Parameters (Billions) | 540 | Minerva 540B (5-shot) mCoT |
| Question Answering | MATH | Accuracy | 20.4 | GAL 120B (5-shot) mCoT |
| Question Answering | MATH | Parameters (Billions) | 120 | GAL 120B (5-shot) mCoT |
| Question Answering | MATH | Accuracy | 16.6 | GAL 120B <work> |
| Question Answering | MATH | Parameters (Billions) | 120 | GAL 120B <work> |
| Question Answering | MATH | Accuracy | 12.7 | GAL 30B (5-shot) mCoT |
| Question Answering | MATH | Parameters (Billions) | 30 | GAL 30B (5-shot) mCoT |
| Question Answering | MATH | Accuracy | 11.4 | GAL 30B <work> |
| Question Answering | MATH | Parameters (Billions) | 30 | GAL 30B <work> |
| Question Answering | MATH | Accuracy | 8.8 | PaLM 540B (5-shot) mCoT |
| Question Answering | MATH | Parameters (Billions) | 540 | PaLM 540B (5-shot) mCoT |
| Question Answering | MATH | Accuracy | 5.2 | GPT-3 175B (8-shot) |
| Question Answering | MATH | Parameters (Billions) | 175 | GPT-3 175B (8-shot) |
| Common Sense Reasoning | ARC (Challenge) | Accuracy | 67.9 | GAL 120B (zero-shot) |
| Common Sense Reasoning | ARC (Challenge) | Accuracy | 51.4 | GPT-3 (zero-shot) |
| Common Sense Reasoning | ARC (Challenge) | Accuracy | 32.9 | BLOOM (few-shot, k=5) |
| Common Sense Reasoning | ARC (Challenge) | Accuracy | 31.1 | OPT (few-shot, k=5) |
| Common Sense Reasoning | ARC (Easy) | Accuracy | 83.8 | GAL 120B (zero-shot) |
| Common Sense Reasoning | ARC (Easy) | Accuracy | 68.8 | GPT-3 (zero-shot) |
| Common Sense Reasoning | ARC (Easy) | Accuracy | 40.7 | BLOOM (few-shot, k=5) |
| Common Sense Reasoning | ARC (Easy) | Accuracy | 37.4 | OPT (few-shot, k=5) |
| Word Sense Disambiguation | BIG-bench (Anachronisms) | Accuracy | 49.1 | OPT 175B |
| Word Sense Disambiguation | BIG-bench (Anachronisms) | Accuracy | 48.7 | GAL 120B (few-shot, k=5) |
| Word Sense Disambiguation | BIG-bench (Anachronisms) | Accuracy | 47 | GAL 30B (few-shot, k=5) |
| Word Sense Disambiguation | BIG-bench (Anachronisms) | Accuracy | 1.3 | BLOOM 176B |
| Drug Discovery | tdcommons | TDC.BBB_Martins | 0.661 | Galactica-GAL-120B |
| Drug Discovery | tdcommons | TDC.BBB_Martins | 0.604 | Galactica-GAL-1.3B |
| Drug Discovery | tdcommons | TDC.BBB_Martins | 0.596 | Galactica-GAL-30B |
| Drug Discovery | tdcommons | TDC.BBB_Martins | 0.535 | Galactica-GAL-6.7B |
| Drug Discovery | tdcommons | TDC.BBB_Martins | 0.393 | Galactica-GAL-125M |
| Math Word Problem Solving | MATH | Accuracy | 33.6 | Minerva 540B (5-shot) mCoT |
| Math Word Problem Solving | MATH | Parameters (Billions) | 540 | Minerva 540B (5-shot) mCoT |
| Math Word Problem Solving | MATH | Accuracy | 20.4 | GAL 120B (5-shot) mCoT |
| Math Word Problem Solving | MATH | Parameters (Billions) | 120 | GAL 120B (5-shot) mCoT |
| Math Word Problem Solving | MATH | Accuracy | 16.6 | GAL 120B <work> |
| Math Word Problem Solving | MATH | Parameters (Billions) | 120 | GAL 120B <work> |
| Math Word Problem Solving | MATH | Accuracy | 12.7 | GAL 30B (5-shot) mCoT |
| Math Word Problem Solving | MATH | Parameters (Billions) | 30 | GAL 30B (5-shot) mCoT |
| Math Word Problem Solving | MATH | Accuracy | 11.4 | GAL 30B <work> |
| Math Word Problem Solving | MATH | Parameters (Billions) | 30 | GAL 30B <work> |
| Math Word Problem Solving | MATH | Accuracy | 8.8 | PaLM 540B (5-shot) mCoT |
| Math Word Problem Solving | MATH | Parameters (Billions) | 540 | PaLM 540B (5-shot) mCoT |
| Math Word Problem Solving | MATH | Accuracy | 5.2 | GPT-3 175B (8-shot) |
| Math Word Problem Solving | MATH | Parameters (Billions) | 175 | GPT-3 175B (8-shot) |
| Molecular Property Prediction | clintox | Molecules (M) | 2 | GAL 120B |
| Molecular Property Prediction | clintox | ROC-AUC | 82.6 | GAL 120B |
| Molecular Property Prediction | clintox | Molecules (M) | 2 | GAL 30B |
| Molecular Property Prediction | clintox | ROC-AUC | 82.2 | GAL 30B |
| Molecular Property Prediction | clintox | Molecules (M) | 2 | GAL 6.7B |
| Molecular Property Prediction | clintox | ROC-AUC | 78.4 | GAL 6.7B |
| Molecular Property Prediction | clintox | Molecules (M) | 2 | GAL 1.3B |
| Molecular Property Prediction | clintox | ROC-AUC | 58.9 | GAL 1.3B |
| Molecular Property Prediction | clintox | Molecules (M) | 2 | GAL 125M |
| Molecular Property Prediction | clintox | ROC-AUC | 51.8 | GAL 125M |
| Molecular Property Prediction | MoleculeNet | AUC | 0.77 | Uni-Mol |
| Molecular Property Prediction | MoleculeNet | AUC | 0.69 | GAL 30B |
| Molecular Property Prediction | MoleculeNet | AUC | 0.64 | GAL 6.7B |
| Molecular Property Prediction | MoleculeNet | AUC | 0.619 | GAL 1.3B |
| Molecular Property Prediction | MoleculeNet | AUC | 0.581 | GAL 125M |
| Molecular Property Prediction | BBBP | ROC-AUC | 72.9 | Uni-Mol |
| Molecular Property Prediction | BBBP | ROC-AUC | 66.1 | GAL 120B |
| Molecular Property Prediction | BBBP | ROC-AUC | 60.4 | GAL 1.3B |
| Molecular Property Prediction | BBBP | ROC-AUC | 59.6 | GAL 30B |
| Molecular Property Prediction | BBBP | ROC-AUC | 53.5 | GAL 6.7B |
| Molecular Property Prediction | BBBP | ROC-AUC | 39.3 | GAL 125M |
| Molecular Property Prediction | HIV dataset | AUC | 0.808 | Uni-Mol |
| Molecular Property Prediction | HIV dataset | AUC | 0.759 | GAL 30B |
| Molecular Property Prediction | HIV dataset | AUC | 0.745 | GAL 120B |
| Molecular Property Prediction | HIV dataset | AUC | 0.724 | GAL 1.3B |
| Molecular Property Prediction | HIV dataset | AUC | 0.722 | GAL 6.7B |
| Molecular Property Prediction | HIV dataset | AUC | 0.702 | GAL 125M |
| Molecular Property Prediction | SIDER | ROC-AUC | 63.2 | GAL 120B |
| Molecular Property Prediction | SIDER | ROC-AUC | 61.3 | GAL 30B |
| Molecular Property Prediction | SIDER | ROC-AUC | 55.9 | GAL 125M |
| Molecular Property Prediction | SIDER | ROC-AUC | 55.9 | GAL 6.7B |
| Molecular Property Prediction | SIDER | ROC-AUC | 54 | GAL 1.3B |
| Molecular Property Prediction | Tox21 | ROC-AUC | 79.6 | Uni-Mol |
| Molecular Property Prediction | Tox21 | ROC-AUC | 68.9 | GAL 120B |
| Molecular Property Prediction | Tox21 | ROC-AUC | 68.5 | GAL 30B |
| Molecular Property Prediction | Tox21 | ROC-AUC | 63.9 | GAL 6.7B |
| Molecular Property Prediction | Tox21 | ROC-AUC | 60.6 | GAL 1.3B |
| Molecular Property Prediction | Tox21 | ROC-AUC | 54.3 | GAL 125M |
| Molecular Property Prediction | BACE | ROC-AUC | 72.7 | GAL 30B |
| Molecular Property Prediction | BACE | ROC-AUC | 61.7 | GAL 120B |
| Molecular Property Prediction | BACE | ROC-AUC | 58.4 | GAL 6.7B |
| Molecular Property Prediction | BACE | ROC-AUC | 57.6 | GAL 1.3B |
| Molecular Property Prediction | BACE | ROC-AUC | 56.1 | GAL 125M |
| Bias Detection | StereoSet | ICAT Score | 65.6 | GAL 120B |
| Bias Detection | StereoSet | LMS | 75 | GAL 120B |
| Bias Detection | StereoSet | SS | 56.2 | GAL 120B |
| Bias Detection | StereoSet | ICAT Score | 60.8 | GPT-3 (text-davinci-002) |
| Bias Detection | StereoSet | LMS | 77.6 | GPT-3 (text-davinci-002) |
| Bias Detection | StereoSet | SS | 60.8 | GPT-3 (text-davinci-002) |
| Bias Detection | StereoSet | ICAT Score | 60 | OPT 175B |
| Bias Detection | StereoSet | LMS | 74.8 | OPT 175B |
| Bias Detection | StereoSet | SS | 59.9 | OPT 175B |
| Protein Structure Prediction | CASPSeq | Validation perplexity | 17.26 | GAL 120B |
| Protein Structure Prediction | CASPSeq | Validation perplexity | 17.27 | GAL 30B |
| Protein Structure Prediction | CASPSeq | Validation perplexity | 17.29 | GAL 6.7B |
| Protein Structure Prediction | CASPSeq | Validation perplexity | 17.58 | GAL 1.3B |
| Protein Structure Prediction | CASPSeq | Validation perplexity | 20.62 | GAL 125M |
| Protein Structure Prediction | UniProtSeq | Validation perplexity | 5.54 | GAL 120B |
| Protein Structure Prediction | UniProtSeq | Validation perplexity | 8.23 | GAL 30B |
| Protein Structure Prediction | UniProtSeq | Validation perplexity | 11.58 | GAL 6.7B |
| Protein Structure Prediction | UniProtSeq | Validation perplexity | 15.82 | GAL 1.3B |
| Protein Structure Prediction | UniProtSeq | Validation perplexity | 19.05 | GAL 125M |
| Protein Structure Prediction | PaenSeq | Validation perplexity | 3.14 | GAL 120B |
| Protein Structure Prediction | PaenSeq | Validation perplexity | 4.28 | GAL 30B |
| Protein Structure Prediction | PaenSeq | Validation perplexity | 7.76 | GAL 6.7B |
| Protein Structure Prediction | PaenSeq | Validation perplexity | 12.53 | GAL 1.3B |
| Protein Structure Prediction | PaenSeq | Validation perplexity | 16.35 | GAL 125M |
| Protein Structure Prediction | CASPSimSeq | Validation perplexity | 12.77 | GAL 120B |
| Protein Structure Prediction | CASPSimSeq | Validation perplexity | 15.42 | GAL 30B |
| Protein Structure Prediction | CASPSimSeq | Validation perplexity | 16.35 | GAL 6.7B |
| Protein Structure Prediction | CASPSimSeq | Validation perplexity | 17.04 | GAL 1.3B |
| Protein Structure Prediction | CASPSimSeq | Validation perplexity | 19.18 | GAL 125M |
| Mathematical Question Answering | MATH | Accuracy | 33.6 | Minerva 540B (5-shot) mCoT |
| Mathematical Question Answering | MATH | Parameters (Billions) | 540 | Minerva 540B (5-shot) mCoT |
| Mathematical Question Answering | MATH | Accuracy | 20.4 | GAL 120B (5-shot) mCoT |
| Mathematical Question Answering | MATH | Parameters (Billions) | 120 | GAL 120B (5-shot) mCoT |
| Mathematical Question Answering | MATH | Accuracy | 16.6 | GAL 120B <work> |
| Mathematical Question Answering | MATH | Parameters (Billions) | 120 | GAL 120B <work> |
| Mathematical Question Answering | MATH | Accuracy | 12.7 | GAL 30B (5-shot) mCoT |
| Mathematical Question Answering | MATH | Parameters (Billions) | 30 | GAL 30B (5-shot) mCoT |
| Mathematical Question Answering | MATH | Accuracy | 11.4 | GAL 30B <work> |
| Mathematical Question Answering | MATH | Parameters (Billions) | 30 | GAL 30B <work> |
| Mathematical Question Answering | MATH | Accuracy | 8.8 | PaLM 540B (5-shot) mCoT |
| Mathematical Question Answering | MATH | Parameters (Billions) | 540 | PaLM 540B (5-shot) mCoT |
| Mathematical Question Answering | MATH | Accuracy | 5.2 | GPT-3 175B (8-shot) |
| Mathematical Question Answering | MATH | Parameters (Billions) | 175 | GPT-3 175B (8-shot) |
| Multi-Task Learning | MMLU | Average (%) | 52.6 | GAL 120B (zero-shot) |
| Mathematical Reasoning | MATH | Accuracy | 33.6 | Minerva 540B (5-shot) mCoT |
| Mathematical Reasoning | MATH | Parameters (Billions) | 540 | Minerva 540B (5-shot) mCoT |
| Mathematical Reasoning | MATH | Accuracy | 20.4 | GAL 120B (5-shot) mCoT |
| Mathematical Reasoning | MATH | Parameters (Billions) | 120 | GAL 120B (5-shot) mCoT |
| Mathematical Reasoning | MATH | Accuracy | 16.6 | GAL 120B <work> |
| Mathematical Reasoning | MATH | Parameters (Billions) | 120 | GAL 120B <work> |
| Mathematical Reasoning | MATH | Accuracy | 12.7 | GAL 30B (5-shot) mCoT |
| Mathematical Reasoning | MATH | Parameters (Billions) | 30 | GAL 30B (5-shot) mCoT |
| Mathematical Reasoning | MATH | Accuracy | 11.4 | GAL 30B <work> |
| Mathematical Reasoning | MATH | Parameters (Billions) | 30 | GAL 30B <work> |
| Mathematical Reasoning | MATH | Accuracy | 8.8 | PaLM 540B (5-shot) mCoT |
| Mathematical Reasoning | MATH | Parameters (Billions) | 540 | PaLM 540B (5-shot) mCoT |
| Mathematical Reasoning | MATH | Accuracy | 5.2 | GPT-3 175B (8-shot) |
| Mathematical Reasoning | MATH | Parameters (Billions) | 175 | GPT-3 175B (8-shot) |
| Stereotypical Bias Analysis | CrowS-Pairs | Age | 69 | GAL 120B |
| Stereotypical Bias Analysis | CrowS-Pairs | Disability | 66.7 | GAL 120B |
| Stereotypical Bias Analysis | CrowS-Pairs | Gender | 51.9 | GAL 120B |
| Stereotypical Bias Analysis | CrowS-Pairs | Nationality | 51.6 | GAL 120B |
| Stereotypical Bias Analysis | CrowS-Pairs | Overall | 60.5 | GAL 120B |
| Stereotypical Bias Analysis | CrowS-Pairs | Physical Appearance | 58.7 | GAL 120B |
| Stereotypical Bias Analysis | CrowS-Pairs | Race/Color | 59.9 | GAL 120B |
| Stereotypical Bias Analysis | CrowS-Pairs | Religion | 51.9 | GAL 120B |
| Stereotypical Bias Analysis | CrowS-Pairs | Sexual Orientation | 77.4 | GAL 120B |
| Stereotypical Bias Analysis | CrowS-Pairs | Socioeconomic status | 65.7 | GAL 120B |
| Protein Function Prediction | UniProtSeq | ROUGE-L | 0.252 | GAL 120B |
| Protein Function Prediction | UniProtSeq | ROUGE-L | 0.186 | GAL 30B |
| Protein Function Prediction | UniProtSeq | ROUGE-L | 0.111 | GAL 6.7B |
| Protein Function Prediction | UniProtSeq | ROUGE-L | 0.079 | GAL 1.3B |
| Protein Function Prediction | UniProtSeq | ROUGE-L | 0.061 | GAL 125M |
| Protein Function Prediction | CASPSimSeq | ROUGE-L | 0.252 | GAL 120B |
| Protein Function Prediction | CASPSimSeq | ROUGE-L | 0.137 | GAL 30B |
| Protein Function Prediction | CASPSimSeq | ROUGE-L | 0.109 | GAL 6.7B |
| Protein Function Prediction | CASPSimSeq | ROUGE-L | 0.069 | GAL 1.3B |
| Protein Function Prediction | CASPSimSeq | ROUGE-L | 0.062 | GAL 125M |
| Protein Function Prediction | PaenSeq | ROUGE-L | 0.272 | GAL 120B |
| Protein Function Prediction | PaenSeq | ROUGE-L | 0.196 | GAL 30B |
| Protein Function Prediction | PaenSeq | ROUGE-L | 0.137 | GAL 6.7B |
| Protein Function Prediction | PaenSeq | ROUGE-L | 0.084 | GAL 1.3B |
| Protein Function Prediction | PaenSeq | ROUGE-L | 0.073 | GAL 125M |
| Atomistic Description | clintox | Molecules (M) | 2 | GAL 120B |
| Atomistic Description | clintox | ROC-AUC | 82.6 | GAL 120B |
| Atomistic Description | clintox | Molecules (M) | 2 | GAL 30B |
| Atomistic Description | clintox | ROC-AUC | 82.2 | GAL 30B |
| Atomistic Description | clintox | Molecules (M) | 2 | GAL 6.7B |
| Atomistic Description | clintox | ROC-AUC | 78.4 | GAL 6.7B |
| Atomistic Description | clintox | Molecules (M) | 2 | GAL 1.3B |
| Atomistic Description | clintox | ROC-AUC | 58.9 | GAL 1.3B |
| Atomistic Description | clintox | Molecules (M) | 2 | GAL 125M |
| Atomistic Description | clintox | ROC-AUC | 51.8 | GAL 125M |
| Atomistic Description | MoleculeNet | AUC | 0.77 | Uni-Mol |
| Atomistic Description | MoleculeNet | AUC | 0.69 | GAL 30B |
| Atomistic Description | MoleculeNet | AUC | 0.64 | GAL 6.7B |
| Atomistic Description | MoleculeNet | AUC | 0.619 | GAL 1.3B |
| Atomistic Description | MoleculeNet | AUC | 0.581 | GAL 125M |
| Atomistic Description | BBBP | ROC-AUC | 72.9 | Uni-Mol |
| Atomistic Description | BBBP | ROC-AUC | 66.1 | GAL 120B |
| Atomistic Description | BBBP | ROC-AUC | 60.4 | GAL 1.3B |
| Atomistic Description | BBBP | ROC-AUC | 59.6 | GAL 30B |
| Atomistic Description | BBBP | ROC-AUC | 53.5 | GAL 6.7B |
| Atomistic Description | BBBP | ROC-AUC | 39.3 | GAL 125M |
| Atomistic Description | HIV dataset | AUC | 0.808 | Uni-Mol |
| Atomistic Description | HIV dataset | AUC | 0.759 | GAL 30B |
| Atomistic Description | HIV dataset | AUC | 0.745 | GAL 120B |
| Atomistic Description | HIV dataset | AUC | 0.724 | GAL 1.3B |
| Atomistic Description | HIV dataset | AUC | 0.722 | GAL 6.7B |
| Atomistic Description | HIV dataset | AUC | 0.702 | GAL 125M |
| Atomistic Description | SIDER | ROC-AUC | 63.2 | GAL 120B |
| Atomistic Description | SIDER | ROC-AUC | 61.3 | GAL 30B |
| Atomistic Description | SIDER | ROC-AUC | 55.9 | GAL 125M |
| Atomistic Description | SIDER | ROC-AUC | 55.9 | GAL 6.7B |
| Atomistic Description | SIDER | ROC-AUC | 54 | GAL 1.3B |
| Atomistic Description | Tox21 | ROC-AUC | 79.6 | Uni-Mol |
| Atomistic Description | Tox21 | ROC-AUC | 68.9 | GAL 120B |
| Atomistic Description | Tox21 | ROC-AUC | 68.5 | GAL 30B |
| Atomistic Description | Tox21 | ROC-AUC | 63.9 | GAL 6.7B |
| Atomistic Description | Tox21 | ROC-AUC | 60.6 | GAL 1.3B |
| Atomistic Description | Tox21 | ROC-AUC | 54.3 | GAL 125M |
| Atomistic Description | BACE | ROC-AUC | 72.7 | GAL 30B |
| Atomistic Description | BACE | ROC-AUC | 61.7 | GAL 120B |
| Atomistic Description | BACE | ROC-AUC | 58.4 | GAL 6.7B |
| Atomistic Description | BACE | ROC-AUC | 57.6 | GAL 1.3B |
| Atomistic Description | BACE | ROC-AUC | 56.1 | GAL 125M |
| Therapeutics Data Commons | tdcommons | TDC.BBB_Martins | 0.661 | Galactica-GAL-120B |
| Therapeutics Data Commons | tdcommons | TDC.BBB_Martins | 0.604 | Galactica-GAL-1.3B |
| Therapeutics Data Commons | tdcommons | TDC.BBB_Martins | 0.596 | Galactica-GAL-30B |
| Therapeutics Data Commons | tdcommons | TDC.BBB_Martins | 0.535 | Galactica-GAL-6.7B |
| Therapeutics Data Commons | tdcommons | TDC.BBB_Martins | 0.393 | Galactica-GAL-125M |
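
The StereoSet rows above report three linked metrics per model: the language modeling score (LMS), the stereotype score (SS), and the ICAT score. As a rough sanity check, the sketch below recomputes ICAT from LMS and SS using the definition from the StereoSet benchmark, ICAT = LMS × min(SS, 100 − SS) / 50. It assumes that definition is the one used for these rows; the function name `icat` and the `rows` dictionary are illustrative and do not come from the source. Small gaps between computed and reported values are expected, since the LMS and SS inputs in the table are already rounded.

```python
def icat(lms: float, ss: float) -> float:
    """Idealized CAT score (assumed StereoSet definition).

    lms: language modeling score, in percent (higher is better).
    ss:  stereotype score, in percent (50 is the unbiased ideal).
    """
    return lms * min(ss, 100.0 - ss) / 50.0


# (LMS, SS, reported ICAT) copied from the StereoSet rows of the table.
rows = {
    "GAL 120B": (75.0, 56.2, 65.6),
    "GPT-3 (text-davinci-002)": (77.6, 60.8, 60.8),
    "OPT 175B": (74.8, 59.9, 60.0),
}

for model, (lms, ss, reported) in rows.items():
    computed = icat(lms, ss)
    print(f"{model}: computed ICAT {computed:.2f}, reported {reported}")
```

Under this definition a model with SS of exactly 50 and LMS of 100 would score 100, so ICAT rewards models that stay fluent while showing no preference for stereotypical over anti-stereotypical completions.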