Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Galactica: A Large Language Model for Science

Ross Taylor, Marcin Kardas, Guillem Cucurull, Thomas Scialom, Anthony Hartshorn, Elvis Saravia, Andrew Poulton, Viktor Kerkez, Robert Stojnic

2022-11-16

Tasks: Molecular Property Prediction, Question Answering, Mathematical Reasoning, Protein Annotation, Math, Math Word Problem Solving, Multi-task Language Understanding, Protein Structure Prediction, IUPAC Name Prediction, Stereotypical Bias Analysis, Common Sense Reasoning, Large Language Model, Classification, MMLU, Bias Detection, Citation Prediction, Knowledge Probing, Word Sense Disambiguation, Language Modelling, Multiple Choice Question Answering (MCQA), Protein Function Prediction, TDC ADMET Benchmarking Group

Abstract

Information overload is a major obstacle to scientific progress. The explosive growth in scientific literature and data has made it ever harder to discover useful insights in a large mass of information. Today scientific knowledge is accessed through search engines, but they are unable to organize scientific knowledge alone. In this paper we introduce Galactica: a large language model that can store, combine and reason about scientific knowledge. We train on a large scientific corpus of papers, reference material, knowledge bases and many other sources. We outperform existing models on a range of scientific tasks. On technical knowledge probes such as LaTeX equations, Galactica outperforms the latest GPT-3 by 68.2% versus 49.0%. Galactica also performs well on reasoning, outperforming Chinchilla on mathematical MMLU by 41.3% to 35.7%, and PaLM 540B on MATH with a score of 20.4% versus 8.8%. It also sets a new state-of-the-art on downstream tasks such as PubMedQA and MedMCQA dev of 77.6% and 52.9%. And despite not being trained on a general corpus, Galactica outperforms BLOOM and OPT-175B on BIG-bench. We believe these results demonstrate the potential for language models as a new interface for science. We open source the model for the benefit of the scientific community.

Results

Task | Dataset | Metric | Value | Model
--- | --- | --- | --- | ---
Transfer Learning | MML | Average (%) | 52.6 | GAL 120B (zero-shot)
Question Answering | PubMedQA | Accuracy | 77.6 | GAL 120B (zero-shot)
Question Answering | PubMedQA | Accuracy | 73.6 | BLOOM (zero-shot)
Question Answering | PubMedQA | Accuracy | 70.2 | OPT (zero-shot)
Question Answering | MedQA | Accuracy | 44.4 | GAL 120B (zero-shot)
Question Answering | MedQA | Accuracy | 23.3 | BLOOM (few-shot, k=5)
Question Answering | MedQA | Accuracy | 22.8 | OPT (few-shot, k=5)
Question Answering | BioASQ | Accuracy | 94.3 | GAL 120B (zero-shot)
Question Answering | BioASQ | Accuracy | 91.4 | BLOOM (zero-shot)
Question Answering | BioASQ | Accuracy | 81.4 | OPT (zero-shot)
Question Answering | TruthfulQA | MC1 | 0.26 | GAL 120B
Question Answering | TruthfulQA | MC1 | 0.24 | GAL 30B
Question Answering | TruthfulQA | MC1 | 0.21 | OPT 175B
Question Answering | TruthfulQA | MC1 | 0.19 | GAL 125M
Question Answering | TruthfulQA | MC1 | 0.19 | GAL 1.3B
Question Answering | TruthfulQA | MC1 | 0.19 | GAL 6.7B
Question Answering | MMLU (Econometrics) | Accuracy | 43 | Gopher (few-shot, k=5)
Question Answering | MMLU (Econometrics) | Accuracy | 42.1 | GAL 120B (zero-shot)
Question Answering | MMLU (Econometrics) | Accuracy | 38.6 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (Econometrics) | Accuracy | 23.7 | BLOOM (few-shot, k=5)
Question Answering | MMLU (Econometrics) | Accuracy | 21 | OPT (few-shot, k=5)
Question Answering | MMLU (College Biology) | Accuracy | 79.9 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (College Biology) | Accuracy | 70.8 | Gopher (few-shot, k=5)
Question Answering | MMLU (College Biology) | Accuracy | 68.8 | GAL 120B (zero-shot)
Question Answering | MMLU (College Biology) | Accuracy | 30.6 | OPT (few-shot, k=5)
Question Answering | MMLU (College Biology) | Accuracy | 28.5 | BLOOM (few-shot, k=5)
Question Answering | MMLU (Machine Learning) | Accuracy | 41.1 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (Machine Learning) | Accuracy | 38.4 | GAL 120B (zero-shot)
Question Answering | MMLU (Machine Learning) | Accuracy | 28.6 | OPT (few-shot, k=5)
Question Answering | MMLU (Machine Learning) | Accuracy | 25 | BLOOM (few-shot, k=5)
Question Answering | MMLU (High School Physics) | Accuracy | 36.4 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (High School Physics) | Accuracy | 33.8 | GAL 120B (zero-shot)
Question Answering | MMLU (High School Physics) | Accuracy | 29.8 | OPT (few-shot, k=5)
Question Answering | MMLU (High School Physics) | Accuracy | 25.2 | BLOOM (few-shot, k=5)
Question Answering | MMLU (Medical Genetics) | Accuracy | 70 | GAL 30B (zero-shot)
Question Answering | MMLU (Medical Genetics) | Accuracy | 69 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (Medical Genetics) | Accuracy | 68 | GAL 120B (zero-shot)
Question Answering | MMLU (Medical Genetics) | Accuracy | 36 | BLOOM (few-shot, k=5)
Question Answering | MMLU (Medical Genetics) | Accuracy | 35 | OPT (few-shot, k=5)
Question Answering | MMLU (High School Computer Science) | Accuracy | 70 | GAL 120B (zero-shot)
Question Answering | MMLU (High School Computer Science) | Accuracy | 58 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (High School Computer Science) | Accuracy | 54 | Gopher (few-shot, k=5)
Question Answering | MMLU (High School Computer Science) | Accuracy | 30 | OPT (few-shot, k=5)
Question Answering | MMLU (High School Computer Science) | Accuracy | 25 | BLOOM (few-shot, k=5)
Question Answering | MMLU (College Chemistry) | Accuracy | 51 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (College Chemistry) | Accuracy | 46 | GAL 120B (zero-shot)
Question Answering | MMLU (College Chemistry) | Accuracy | 45 | Gopher (few-shot, k=5)
Question Answering | MMLU (College Chemistry) | Accuracy | 30 | OPT (few-shot, k=5)
Question Answering | MMLU (College Chemistry) | Accuracy | 19 | BLOOM (few-shot, k=5)
Question Answering | MMLU (College Computer Science) | Accuracy | 51 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (College Computer Science) | Accuracy | 49 | GAL 120B (zero-shot)
Question Answering | MMLU (College Computer Science) | Accuracy | 17 | OPT (few-shot, k=5)
Question Answering | MMLU (College Computer Science) | Accuracy | 6 | BLOOM (few-shot, k=5)
Question Answering | MMLU (Astronomy) | Accuracy | 73 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (Astronomy) | Accuracy | 65.8 | Gopher (few-shot, k=5)
Question Answering | MMLU (Astronomy) | Accuracy | 65.1 | GAL 120B (zero-shot)
Question Answering | MMLU (Astronomy) | Accuracy | 25.7 | BLOOM (few-shot, k=5)
Question Answering | MMLU (Astronomy) | Accuracy | 23 | OPT (few-shot, k=5)
Question Answering | MMLU (Electrical Engineer) | Accuracy | 62.8 | GAL 120B (zero-shot)
Question Answering | MMLU (Electrical Engineer) | Accuracy | 62.1 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (Electrical Engineer) | Accuracy | 60 | Gopher (few-shot, k=5)
Question Answering | MMLU (Electrical Engineer) | Accuracy | 36.6 | OPT (few-shot, k=5)
Question Answering | MMLU (Electrical Engineer) | Accuracy | 32.4 | BLOOM (few-shot, k=5)
Question Answering | MMLU (Formal Logic) | Accuracy | 35.7 | Gopher (few-shot, k=5)
Question Answering | MMLU (Formal Logic) | Accuracy | 33.3 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (Formal Logic) | Accuracy | 32.5 | GAL 120B (zero-shot)
Question Answering | MMLU (Formal Logic) | Accuracy | 29.4 | OPT (few-shot, k=5)
Question Answering | MMLU (Formal Logic) | Accuracy | 26.2 | BLOOM (few-shot, k=5)
Question Answering | MMLU (High School Biology) | Accuracy | 80.3 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (High School Biology) | Accuracy | 71.3 | Gopher (few-shot, k=5)
Question Answering | MMLU (High School Biology) | Accuracy | 69.4 | GAL 120B (zero-shot)
Question Answering | MMLU (High School Biology) | Accuracy | 29.4 | BLOOM (few-shot, k=5)
Question Answering | MMLU (High School Biology) | Accuracy | 27.7 | OPT (few-shot, k=5)
Question Answering | MMLU (High School Mathematics) | Accuracy | 32.6 | GAL 120B (zero-shot)
Question Answering | MMLU (High School Mathematics) | Accuracy | 31.9 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (High School Mathematics) | Accuracy | 27 | BLOOM (few-shot, k=5)
Question Answering | MMLU (High School Mathematics) | Accuracy | 24.4 | OPT (few-shot, k=5)
Question Answering | MMLU (High School Mathematics) | Accuracy | 23.7 | Gopher (few-shot, k=5)
Question Answering | MedMCQA | Dev Set (Acc-%) | 0.529 | GAL 120B (zero-shot)
Question Answering | MedMCQA | Dev Set (Acc-%) | 0.325 | BLOOM (few-shot, k=5)
Question Answering | MedMCQA | Dev Set (Acc-%) | 0.296 | OPT (few-shot, k=5)
Question Answering | MMLU (High School Chemistry) | Accuracy | 58.1 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (High School Chemistry) | Accuracy | 47.8 | GAL 120B (zero-shot)
Question Answering | MMLU (High School Chemistry) | Accuracy | 23.2 | BLOOM (few-shot, k=5)
Question Answering | MMLU (High School Chemistry) | Accuracy | 21.7 | OPT (few-shot, k=5)
Question Answering | MMLU (Elementary Mathematics) | Accuracy | 41.5 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (Elementary Mathematics) | Accuracy | 38.1 | GAL 120B (zero-shot)
Question Answering | MMLU (Elementary Mathematics) | Accuracy | 33.6 | Gopher (few-shot, k=5)
Question Answering | MMLU (Elementary Mathematics) | Accuracy | 27.6 | BLOOM (few-shot, k=5)
Question Answering | MMLU (Elementary Mathematics) | Accuracy | 25.7 | OPT (few-shot, k=5)
Question Answering | MMLU (Abstract Algebra) | Accuracy | 33.3 | GAL 30B (zero-shot)
Question Answering | MMLU (Abstract Algebra) | Accuracy | 31 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (Abstract Algebra) | Accuracy | 27 | GAL 120B (zero-shot)
Question Answering | MMLU (Abstract Algebra) | Accuracy | 25 | Gopher (few-shot, k=5)
Question Answering | MMLU (Abstract Algebra) | Accuracy | 21 | OPT (few-shot, k=5)
Question Answering | MMLU (High School Statistics) | Accuracy | 58.8 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (High School Statistics) | Accuracy | 50 | Gopher (few-shot, k=5)
Question Answering | MMLU (High School Statistics) | Accuracy | 43.5 | OPT (few-shot, k=5)
Question Answering | MMLU (High School Statistics) | Accuracy | 41.2 | GAL 120B (zero-shot)
Question Answering | MMLU (High School Statistics) | Accuracy | 19.4 | BLOOM (few-shot, k=5)
Question Answering | MMLU (College Physics) | Accuracy | 46.1 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (College Physics) | Accuracy | 42.2 | GAL 120B (zero-shot)
Question Answering | MMLU (College Physics) | Accuracy | 34.3 | Gopher (few-shot, k=5)
Question Answering | MMLU (College Physics) | Accuracy | 21.6 | OPT (few-shot, k=5)
Question Answering | MMLU (College Physics) | Accuracy | 18.6 | BLOOM (few-shot, k=5)
Question Answering | MMLU (College Mathematics) | Accuracy | 43 | GAL 120B (zero-shot)
Question Answering | MMLU (College Mathematics) | Accuracy | 37 | Gopher (few-shot, k=5)
Question Answering | MMLU (College Mathematics) | Accuracy | 33 | OPT (few-shot, k=5)
Question Answering | MMLU (College Mathematics) | Accuracy | 32 | Chinchilla (few-shot, k=5)
Question Answering | MMLU (College Mathematics) | Accuracy | 25 | BLOOM (few-shot, k=5)
Question Answering | MATH | Accuracy | 33.6 | Minerva 540B (5-shot) mCoT
Question Answering | MATH | Parameters (Billions) | 540 | Minerva 540B (5-shot) mCoT
Question Answering | MATH | Accuracy | 20.4 | GAL 120B (5-shot) mCoT
Question Answering | MATH | Parameters (Billions) | 120 | GAL 120B (5-shot) mCoT
Question Answering | MATH | Accuracy | 16.6 | GAL 120B <work>
Question Answering | MATH | Parameters (Billions) | 120 | GAL 120B <work>
Question Answering | MATH | Accuracy | 12.7 | GAL 30B (5-shot) mCoT
Question Answering | MATH | Parameters (Billions) | 30 | GAL 30B (5-shot) mCoT
Question Answering | MATH | Accuracy | 11.4 | GAL 30B <work>
Question Answering | MATH | Parameters (Billions) | 30 | GAL 30B <work>
Question Answering | MATH | Accuracy | 8.8 | PaLM 540B (5-shot) mCoT
Question Answering | MATH | Parameters (Billions) | 540 | PaLM 540B (5-shot) mCoT
Question Answering | MATH | Accuracy | 5.2 | GPT-3 175B (8-shot)
Question Answering | MATH | Parameters (Billions) | 175 | GPT-3 175B (8-shot)
Common Sense Reasoning | ARC (Challenge) | Accuracy | 67.9 | GAL 120B (zero-shot)
Common Sense Reasoning | ARC (Challenge) | Accuracy | 51.4 | GPT-3 (zero-shot)
Common Sense Reasoning | ARC (Challenge) | Accuracy | 32.9 | BLOOM (few-shot, k=5)
Common Sense Reasoning | ARC (Challenge) | Accuracy | 31.1 | OPT (few-shot, k=5)
Common Sense Reasoning | ARC (Easy) | Accuracy | 83.8 | GAL 120B (0-shot)
Common Sense Reasoning | ARC (Easy) | Accuracy | 68.8 | GPT-3 (zero-shot)
Common Sense Reasoning | ARC (Easy) | Accuracy | 40.7 | BLOOM (5-shot)
Common Sense Reasoning | ARC (Easy) | Accuracy | 37.4 | OPT (5-shot)
Word Sense Disambiguation | BIG-bench (Anachronisms) | Accuracy | 49.1 | OPT 175B
Word Sense Disambiguation | BIG-bench (Anachronisms) | Accuracy | 48.7 | GAL 120B (few-shot, k=5)
Word Sense Disambiguation | BIG-bench (Anachronisms) | Accuracy | 47 | GAL 30B (few-shot, k=5)
Word Sense Disambiguation | BIG-bench (Anachronisms) | Accuracy | 1.3 | BLOOM 176B
Drug Discovery | tdcommons | TDC.BBB_Martins | 0.661 | Galactica-GAL-120B
Drug Discovery | tdcommons | TDC.BBB_Martins | 0.604 | Galactica-GAL-1.3B
Drug Discovery | tdcommons | TDC.BBB_Martins | 0.596 | Galactica-GAL-30B
Drug Discovery | tdcommons | TDC.BBB_Martins | 0.535 | Galactica-GAL-6.7B
Drug Discovery | tdcommons | TDC.BBB_Martins | 0.393 | Galactica-GAL-125M
Math Word Problem Solving | MATH | Accuracy | 33.6 | Minerva 540B (5-shot) mCoT
Math Word Problem Solving | MATH | Parameters (Billions) | 540 | Minerva 540B (5-shot) mCoT
Math Word Problem Solving | MATH | Accuracy | 20.4 | GAL 120B (5-shot) mCoT
Math Word Problem Solving | MATH | Parameters (Billions) | 120 | GAL 120B (5-shot) mCoT
Math Word Problem Solving | MATH | Accuracy | 16.6 | GAL 120B <work>
Math Word Problem Solving | MATH | Parameters (Billions) | 120 | GAL 120B <work>
Math Word Problem Solving | MATH | Accuracy | 12.7 | GAL 30B (5-shot) mCoT
Math Word Problem Solving | MATH | Parameters (Billions) | 30 | GAL 30B (5-shot) mCoT
Math Word Problem Solving | MATH | Accuracy | 11.4 | GAL 30B <work>
Math Word Problem Solving | MATH | Parameters (Billions) | 30 | GAL 30B <work>
Math Word Problem Solving | MATH | Accuracy | 8.8 | PaLM 540B (5-shot) mCoT
Math Word Problem Solving | MATH | Parameters (Billions) | 540 | PaLM 540B (5-shot) mCoT
Math Word Problem Solving | MATH | Accuracy | 5.2 | GPT-3 175B (8-shot)
Math Word Problem Solving | MATH | Parameters (Billions) | 175 | GPT-3 175B (8-shot)
Molecular Property Prediction | clintox | Molecules (M) | 2 | GAL 120B
Molecular Property Prediction | clintox | ROC-AUC | 82.6 | GAL 120B
Molecular Property Prediction | clintox | Molecules (M) | 2 | GAL 30B
Molecular Property Prediction | clintox | ROC-AUC | 82.2 | GAL 30B
Molecular Property Prediction | clintox | Molecules (M) | 2 | GAL 6.7B
Molecular Property Prediction | clintox | ROC-AUC | 78.4 | GAL 6.7B
Molecular Property Prediction | clintox | Molecules (M) | 2 | GAL 1.3B
Molecular Property Prediction | clintox | ROC-AUC | 58.9 | GAL 1.3B
Molecular Property Prediction | clintox | Molecules (M) | 2 | GAL 125M
Molecular Property Prediction | clintox | ROC-AUC | 51.8 | GAL 125M
Molecular Property Prediction | MoleculeNet | AUC | 0.77 | Uni-Mol
Molecular Property Prediction | MoleculeNet | AUC | 0.69 | GAL 30B
Molecular Property Prediction | MoleculeNet | AUC | 0.64 | GAL 6.7B
Molecular Property Prediction | MoleculeNet | AUC | 0.619 | GAL 1.3B
Molecular Property Prediction | MoleculeNet | AUC | 0.581 | GAL 125M
Molecular Property Prediction | BBBP | ROC-AUC | 72.9 | Uni-Mol
Molecular Property Prediction | BBBP | ROC-AUC | 66.1 | GAL 120B
Molecular Property Prediction | BBBP | ROC-AUC | 60.4 | GAL 1.3B
Molecular Property Prediction | BBBP | ROC-AUC | 59.6 | GAL 30B
Molecular Property Prediction | BBBP | ROC-AUC | 53.5 | GAL 6.7B
Molecular Property Prediction | BBBP | ROC-AUC | 39.3 | GAL 125M
Molecular Property Prediction | HIV dataset | AUC | 0.808 | Uni-Mol
Molecular Property Prediction | HIV dataset | AUC | 0.759 | GAL 30B
Molecular Property Prediction | HIV dataset | AUC | 0.745 | GAL 120B
Molecular Property Prediction | HIV dataset | AUC | 0.724 | GAL 1.3B
Molecular Property Prediction | HIV dataset | AUC | 0.722 | GAL 6.7B
Molecular Property Prediction | HIV dataset | AUC | 0.702 | GAL 125M
Molecular Property Prediction | SIDER | ROC-AUC | 63.2 | GAL 120B
Molecular Property Prediction | SIDER | ROC-AUC | 61.3 | GAL 30B
Molecular Property Prediction | SIDER | ROC-AUC | 55.9 | GAL 125M
Molecular Property Prediction | SIDER | ROC-AUC | 55.9 | GAL 6.7B
Molecular Property Prediction | SIDER | ROC-AUC | 54 | GAL 1.3B
Molecular Property Prediction | Tox21 | ROC-AUC | 79.6 | Uni-Mol
Molecular Property Prediction | Tox21 | ROC-AUC | 68.9 | GAL 120B
Molecular Property Prediction | Tox21 | ROC-AUC | 68.5 | GAL 30B
Molecular Property Prediction | Tox21 | ROC-AUC | 63.9 | GAL 6.7B
Molecular Property Prediction | Tox21 | ROC-AUC | 60.6 | GAL 1.3B
Molecular Property Prediction | Tox21 | ROC-AUC | 54.3 | GAL 125M
Molecular Property Prediction | BACE | ROC-AUC | 72.7 | GAL 30B
Molecular Property Prediction | BACE | ROC-AUC | 61.7 | GAL 120B
Molecular Property Prediction | BACE | ROC-AUC | 58.4 | GAL 6.7B
Molecular Property Prediction | BACE | ROC-AUC | 57.6 | GAL 1.3B
Molecular Property Prediction | BACE | ROC-AUC | 56.1 | GAL 125M
Bias Detection | StereoSet | ICAT Score | 65.6 | GAL 120B
Bias Detection | StereoSet | LMS | 75 | GAL 120B
Bias Detection | StereoSet | SS | 56.2 | GAL 120B
Bias Detection | StereoSet | ICAT Score | 60.8 | GPT-3 (text-davinci-002)
Bias Detection | StereoSet | LMS | 77.6 | GPT-3 (text-davinci-002)
Bias Detection | StereoSet | SS | 60.8 | GPT-3 (text-davinci-002)
Bias Detection | StereoSet | ICAT Score | 60 | OPT 175B
Bias Detection | StereoSet | LMS | 74.8 | OPT 175B
Bias Detection | StereoSet | SS | 59.9 | OPT 175B
Protein Structure Prediction | CASPSeq | Validation perplexity | 17.26 | GAL 120B
Protein Structure Prediction | CASPSeq | Validation perplexity | 17.27 | GAL 30B
Protein Structure Prediction | CASPSeq | Validation perplexity | 17.29 | GAL 6.7B
Protein Structure Prediction | CASPSeq | Validation perplexity | 17.58 | GAL 1.3B
Protein Structure Prediction | CASPSeq | Validation perplexity | 20.62 | GAL 125M
Protein Structure Prediction | UniProtSeq | Validation perplexity | 5.54 | GAL 120B
Protein Structure Prediction | UniProtSeq | Validation perplexity | 8.23 | GAL 30B
Protein Structure Prediction | UniProtSeq | Validation perplexity | 11.58 | GAL 6.7B
Protein Structure Prediction | UniProtSeq | Validation perplexity | 15.82 | GAL 1.3B
Protein Structure Prediction | UniProtSeq | Validation perplexity | 19.05 | GAL 125M
Protein Structure Prediction | PaenSeq | Validation perplexity | 3.14 | GAL 120B
Protein Structure Prediction | PaenSeq | Validation perplexity | 4.28 | GAL 30B
Protein Structure Prediction | PaenSeq | Validation perplexity | 7.76 | GAL 6.7B
Protein Structure Prediction | PaenSeq | Validation perplexity | 12.53 | GAL 1.3B
Protein Structure Prediction | PaenSeq | Validation perplexity | 16.35 | GAL 125M
Protein Structure Prediction | CASPSimSeq | Validation perplexity | 12.77 | GAL 120B
Protein Structure Prediction | CASPSimSeq | Validation perplexity | 15.42 | GAL 30B
Protein Structure Prediction | CASPSimSeq | Validation perplexity | 16.35 | GAL 6.7B
Protein Structure Prediction | CASPSimSeq | Validation perplexity | 17.04 | GAL 1.3B
Protein Structure Prediction | CASPSimSeq | Validation perplexity | 19.18 | GAL 125M
Mathematical Question Answering | MATH | Accuracy | 33.6 | Minerva 540B (5-shot) mCoT
Mathematical Question Answering | MATH | Parameters (Billions) | 540 | Minerva 540B (5-shot) mCoT
Mathematical Question Answering | MATH | Accuracy | 20.4 | GAL 120B (5-shot) mCoT
Mathematical Question Answering | MATH | Parameters (Billions) | 120 | GAL 120B (5-shot) mCoT
Mathematical Question Answering | MATH | Accuracy | 16.6 | GAL 120B <work>
Mathematical Question Answering | MATH | Parameters (Billions) | 120 | GAL 120B <work>
Mathematical Question Answering | MATH | Accuracy | 12.7 | GAL 30B (5-shot) mCoT
Mathematical Question Answering | MATH | Parameters (Billions) | 30 | GAL 30B (5-shot) mCoT
Mathematical Question Answering | MATH | Accuracy | 11.4 | GAL 30B <work>
Mathematical Question Answering | MATH | Parameters (Billions) | 30 | GAL 30B <work>
Mathematical Question Answering | MATH | Accuracy | 8.8 | PaLM 540B (5-shot) mCoT
Mathematical Question Answering | MATH | Parameters (Billions) | 540 | PaLM 540B (5-shot) mCoT
Mathematical Question Answering | MATH | Accuracy | 5.2 | GPT-3 175B (8-shot)
Mathematical Question Answering | MATH | Parameters (Billions) | 175 | GPT-3 175B (8-shot)
Multi-Task Learning | MML | Average (%) | 52.6 | GAL 120B (zero-shot)
Mathematical Reasoning | MATH | Accuracy | 33.6 | Minerva 540B (5-shot) mCoT
Mathematical Reasoning | MATH | Parameters (Billions) | 540 | Minerva 540B (5-shot) mCoT
Mathematical Reasoning | MATH | Accuracy | 20.4 | GAL 120B (5-shot) mCoT
Mathematical Reasoning | MATH | Parameters (Billions) | 120 | GAL 120B (5-shot) mCoT
Mathematical Reasoning | MATH | Accuracy | 16.6 | GAL 120B <work>
Mathematical Reasoning | MATH | Parameters (Billions) | 120 | GAL 120B <work>
Mathematical Reasoning | MATH | Accuracy | 12.7 | GAL 30B (5-shot) mCoT
Mathematical Reasoning | MATH | Parameters (Billions) | 30 | GAL 30B (5-shot) mCoT
Mathematical Reasoning | MATH | Accuracy | 11.4 | GAL 30B <work>
Mathematical Reasoning | MATH | Parameters (Billions) | 30 | GAL 30B <work>
Mathematical Reasoning | MATH | Accuracy | 8.8 | PaLM 540B (5-shot) mCoT
Mathematical Reasoning | MATH | Parameters (Billions) | 540 | PaLM 540B (5-shot) mCoT
Mathematical Reasoning | MATH | Accuracy | 5.2 | GPT-3 175B (8-shot)
Mathematical Reasoning | MATH | Parameters (Billions) | 175 | GPT-3 175B (8-shot)
Stereotypical Bias Analysis | CrowS-Pairs | Age | 69 | GAL 120B
Stereotypical Bias Analysis | CrowS-Pairs | Disability | 66.7 | GAL 120B
Stereotypical Bias Analysis | CrowS-Pairs | Gender | 51.9 | GAL 120B
Stereotypical Bias Analysis | CrowS-Pairs | Nationality | 51.6 | GAL 120B
Stereotypical Bias Analysis | CrowS-Pairs | Overall | 60.5 | GAL 120B
Stereotypical Bias Analysis | CrowS-Pairs | Physical Appearance | 58.7 | GAL 120B
Stereotypical Bias Analysis | CrowS-Pairs | Race/Color | 59.9 | GAL 120B
Stereotypical Bias Analysis | CrowS-Pairs | Religion | 51.9 | GAL 120B
Stereotypical Bias Analysis | CrowS-Pairs | Sexual Orientation | 77.4 | GAL 120B
Stereotypical Bias Analysis | CrowS-Pairs | Socioeconomic status | 65.7 | GAL 120B
Protein Function Prediction | UniProtSeq | ROUGE-L | 0.252 | GAL 120B
Protein Function Prediction | UniProtSeq | ROUGE-L | 0.186 | GAL 30B
Protein Function Prediction | UniProtSeq | ROUGE-L | 0.111 | GAL 6.7B
Protein Function Prediction | UniProtSeq | ROUGE-L | 0.079 | GAL 1.3B
Protein Function Prediction | UniProtSeq | ROUGE-L | 0.061 | GAL 125M
Protein Function Prediction | CASPSimSeq | ROUGE-L | 0.252 | GAL 120B
Protein Function Prediction | CASPSimSeq | ROUGE-L | 0.137 | GAL 30B
Protein Function Prediction | CASPSimSeq | ROUGE-L | 0.109 | GAL 6.7B
Protein Function Prediction | CASPSimSeq | ROUGE-L | 0.069 | GAL 1.3B
Protein Function Prediction | CASPSimSeq | ROUGE-L | 0.062 | GAL 125M
Protein Function Prediction | PaenSeq | ROUGE-L | 0.272 | GAL 120B
Protein Function Prediction | PaenSeq | ROUGE-L | 0.196 | GAL 30B
Protein Function Prediction | PaenSeq | ROUGE-L | 0.137 | GAL 6.7B
Protein Function Prediction | PaenSeq | ROUGE-L | 0.084 | GAL 1.3B
Protein Function Prediction | PaenSeq | ROUGE-L | 0.073 | GAL 125M
Atomistic Description | clintox | Molecules (M) | 2 | GAL 120B
Atomistic Description | clintox | ROC-AUC | 82.6 | GAL 120B
Atomistic Description | clintox | Molecules (M) | 2 | GAL 30B
Atomistic Description | clintox | ROC-AUC | 82.2 | GAL 30B
Atomistic Description | clintox | Molecules (M) | 2 | GAL 6.7B
Atomistic Description | clintox | ROC-AUC | 78.4 | GAL 6.7B
Atomistic Description | clintox | Molecules (M) | 2 | GAL 1.3B
Atomistic Description | clintox | ROC-AUC | 58.9 | GAL 1.3B
Atomistic Description | clintox | Molecules (M) | 2 | GAL 125M
Atomistic Description | clintox | ROC-AUC | 51.8 | GAL 125M
Atomistic Description | MoleculeNet | AUC | 0.77 | Uni-Mol
Atomistic Description | MoleculeNet | AUC | 0.69 | GAL 30B
Atomistic Description | MoleculeNet | AUC | 0.64 | GAL 6.7B
Atomistic Description | MoleculeNet | AUC | 0.619 | GAL 1.3B
Atomistic Description | MoleculeNet | AUC | 0.581 | GAL 125M
Atomistic Description | BBBP | ROC-AUC | 72.9 | Uni-Mol
Atomistic Description | BBBP | ROC-AUC | 66.1 | GAL 120B
Atomistic Description | BBBP | ROC-AUC | 60.4 | GAL 1.3B
Atomistic Description | BBBP | ROC-AUC | 59.6 | GAL 30B
Atomistic Description | BBBP | ROC-AUC | 53.5 | GAL 6.7B
Atomistic Description | BBBP | ROC-AUC | 39.3 | GAL 125M
Atomistic Description | HIV dataset | AUC | 0.808 | Uni-Mol
Atomistic Description | HIV dataset | AUC | 0.759 | GAL 30B
Atomistic Description | HIV dataset | AUC | 0.745 | GAL 120B
Atomistic Description | HIV dataset | AUC | 0.724 | GAL 1.3B
Atomistic Description | HIV dataset | AUC | 0.722 | GAL 6.7B
Atomistic Description | HIV dataset | AUC | 0.702 | GAL 125M
Atomistic Description | SIDER | ROC-AUC | 63.2 | GAL 120B
Atomistic Description | SIDER | ROC-AUC | 61.3 | GAL 30B
Atomistic Description | SIDER | ROC-AUC | 55.9 | GAL 125M
Atomistic Description | SIDER | ROC-AUC | 55.9 | GAL 6.7B
Atomistic Description | SIDER | ROC-AUC | 54 | GAL 1.3B
Atomistic Description | Tox21 | ROC-AUC | 79.6 | Uni-Mol
Atomistic Description | Tox21 | ROC-AUC | 68.9 | GAL 120B
Atomistic Description | Tox21 | ROC-AUC | 68.5 | GAL 30B
Atomistic Description | Tox21 | ROC-AUC | 63.9 | GAL 6.7B
Atomistic Description | Tox21 | ROC-AUC | 60.6 | GAL 1.3B
Atomistic Description | Tox21 | ROC-AUC | 54.3 | GAL 125M
Atomistic Description | BACE | ROC-AUC | 72.7 | GAL 30B
Atomistic Description | BACE | ROC-AUC | 61.7 | GAL 120B
Atomistic Description | BACE | ROC-AUC | 58.4 | GAL 6.7B
Atomistic Description | BACE | ROC-AUC | 57.6 | GAL 1.3B
Atomistic Description | BACE | ROC-AUC | 56.1 | GAL 125M
Therapeutics Data Commons | tdcommons | TDC.BBB_Martins | 0.661 | Galactica-GAL-120B
Therapeutics Data Commons | tdcommons | TDC.BBB_Martins | 0.604 | Galactica-GAL-1.3B
Therapeutics Data Commons | tdcommons | TDC.BBB_Martins | 0.596 | Galactica-GAL-30B
Therapeutics Data Commons | tdcommons | TDC.BBB_Martins | 0.535 | Galactica-GAL-6.7B
Therapeutics Data Commons | tdcommons | TDC.BBB_Martins | 0.393 | Galactica-GAL-125M
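
The StereoSet rows in the results report three linked metrics per model: LMS, SS, and ICAT Score. The page does not define them, so the sketch below assumes the definition from the StereoSet benchmark, where ICAT = LMS × min(SS, 100 − SS) / 50, and checks it against the LMS and SS values listed above. The function name `icat` and the `rows` dictionary are illustrative only. Because the listed inputs are already rounded, a computed score can differ from the reported one in the last digit.

```python
def icat(lms: float, ss: float) -> float:
    """Idealized CAT score, assuming the StereoSet definition:
    lms * min(ss, 100 - ss) / 50, with both inputs in percent."""
    return lms * min(ss, 100.0 - ss) / 50.0


# (LMS, SS, reported ICAT) copied from the StereoSet rows of the results table.
rows = {
    "GAL 120B": (75.0, 56.2, 65.6),
    "GPT-3 (text-davinci-002)": (77.6, 60.8, 60.8),
    "OPT 175B": (74.8, 59.9, 60.0),
}

for model, (lms, ss, reported) in rows.items():
    computed = icat(lms, ss)
    print(f"{model}: computed {computed:.2f}, reported {reported}")
```

Under this definition an SS of 50 (no preference between stereotypical and anti-stereotypical completions) leaves ICAT equal to LMS, which is why a model with a slightly lower LMS but an SS closer to 50 can still obtain the higher ICAT.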

Related Papers

Visual-Language Model Knowledge Distillation Method for Image Quality Assessment (2025-07-21)
DENSE: Longitudinal Progress Note Generation with Temporal Modeling of Heterogeneous Clinical Notes Across Hospital Visits (2025-07-18)
From Roots to Rewards: Dynamic Tree Reasoning with RL (2025-07-17)
Enter the Mind Palace: Reasoning and Planning for Long-term Active Embodied Question Answering (2025-07-17)
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It (2025-07-17)
City-VLM: Towards Multidomain Perception Scene Understanding via Multimodal Incomplete Learning (2025-07-17)
VAR-MATH: Probing True Mathematical Reasoning in Large Language Models via Symbolic Multi-Instance Benchmarks (2025-07-17)
QuestA: Expanding Reasoning Capacity in LLMs via Question Augmentation (2025-07-17)