Transfer Learning on MML
Metric: Average (%) (higher is better)
Results
| # | Model | Average (%) | Augmentations | Paper | Date | Code |
|---|-------|-------------|---------------|-------|------|------|
| 1 | GPT-4 o1(300b) | 87 | Yes | GPT-4o as the Gold Standard: A Scalable and Gene... | 2024-10-03 | - |
| 2 | Llama 3.1 (405B) | 86.6 | Yes | Llama 3 Meets MoE: Efficient Upcycling | 2024-12-13 | Code |
| 3 | Llama 3.1 (70B) | 86 | Yes | Llama 3 Meets MoE: Efficient Upcycling | 2024-12-13 | Code |
| 4 | Gemini Ultra (5-shot) | 83.7 | No | - | - | - |
| 5 | Claude 3 Sonnet (5-shot) | 79 | No | - | - | - |
| 6 | Qwen1.5 72B (5-shot) | 77.5 | No | - | - | - |
| 7 | Claude 3 Haiku (5-shot) | 75.2 | No | - | - | - |
| 8 | DBRX Instruct 132B (5-shot) | 73.7 | No | The Llama 3 Herd of Models | 2024-07-31 | Code |
| 9 | llama 2(65b) | 73.5 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 10 | Llama 3.1 8B (CoT) | 73 | Yes | The Llama 3 Herd of Models | 2024-07-31 | Code |
| 11 | Mixtral 8x7B (5-shot) | 70.6 | No | Mixtral of Experts | 2024-01-08 | Code |
| 12 | GPT-3.5 Turbo | 70 | Yes | GPT-4 Technical Report | 2023-03-15 | Code |
| 13 | LLaMA 65B (fine-tuned) | 68.9 | No | LLaMA: Open and Efficient Foundation Language Mo... | 2023-02-27 | Code |
| 14 | chatgpt/gpt3.5(20B) | 67.5 | No | Training Compute-Optimal Large Language Models | 2022-03-29 | Code |
| 15 | LLaMA 65B (5-shot) | 63.4 | No | LLaMA: Open and Efficient Foundation Language Mo... | 2023-02-27 | Code |
| 16 | LLaMA 2 34B (5-shot) | 62.6 | No | Llama 2: Open Foundation and Fine-Tuned Chat Mod... | 2023-07-18 | Code |
| 17 | Mistral 7B (5-shot) | 62.5 | Yes | Mixtral of Experts | 2024-01-08 | Code |
| 18 | Mistral 7B (5-shot) | 60.1 | No | Mistral 7B | 2023-10-10 | Code |
| 19 | GPT-3 Davinci 175B (CoT) | 59.5 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 20 | LLaMA 33B (5-shot) | 57.8 | No | LLaMA: Open and Efficient Foundation Language Mo... | 2023-02-27 | Code |
| 21 | Falcon 40B | 57 | No | The Falcon Series of Open Language Models | 2023-11-28 | - |
| 22 | Qwen 7B (5-shot) | 56.7 | No | - | - | - |
| 23 | LLaMA 2 13B (5-shot) | 54.8 | No | Llama 2: Open Foundation and Fine-Tuned Chat Mod... | 2023-07-18 | Code |
| 24 | Branch-Train-MiX 4x7B (sampling top-1 experts) | 53.2 | No | Branch-Train-MiX: Mixing Expert LLMs into a Mixt... | 2024-03-12 | Code |
| 25 | GAL 120B (zero-shot) | 52.6 | No | Galactica: A Large Language Model for Science | 2022-11-16 | Code |
| 26 | Atlas (5-shot) | 47.9 | No | Atlas: Few-shot Learning with Retrieval Augmente... | 2022-08-05 | Code |
| 27 | Flan-T5-XL 3B (CoT) | 45.5 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 28 | LLaMA 2 7B (5-shot) | 45.3 | No | Llama 2: Open Foundation and Fine-Tuned Chat Mod... | 2023-07-18 | Code |
| 29 | Flan-T5-Large 780M | 45.1 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 30 | GLM-130B | 44.8 | No | GLM-130B: An Open Bilingual Pre-trained Model | 2022-10-05 | Code |
| 31 | Flan-T5-Large 780M (CoT) | 40.5 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 32 | GPT-3 Davinci 175B (5-shot) | 39.7 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 33 | Bloomberg GPT 50B (5-shot) | 39.2 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 34 | UL2 20B (5-shot) | 39.2 | No | UL2: Unifying Language Learning Paradigms | 2022-05-10 | Code |
| 35 | BLOOM 176B (5-shot) | 39.1 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 36 | phi-1.5-web 1.3B | 37.9 | No | Textbooks Are All You Need II: phi-1.5 technical... | 2023-09-11 | Code |
| 37 | OPT 66B (5-shot) | 36 | No | BloombergGPT: A Large Language Model for Finance | 2023-03-30 | Code |
| 38 | Flan-T5-Base 250M | 35.9 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 39 | Flan-T5-Base 250M (CoT) | 33.7 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 40 | GPT-NeoX 20B (5-shot) | 33.6 | No | GPT-NeoX-20B: An Open-Source Autoregressive Lang... | 2022-04-14 | Code |
| 41 | RWKV v5 Eagle 7B | 31 | No | - | - | - |
| 42 | LLaMA7B-MiLe-Loss(5-shot) | 29.68 | No | MiLe Loss: a New Loss for Mitigating the Bias of... | 2023-10-30 | Code |
| 43 | Flan-T5-Small 80M | 28.7 | No | Scaling Instruction-Finetuned Language Models | 2022-10-20 | Code |
| 44 | Falcon 7B (5-shot) | 28 | No | The Falcon Series of Open Language Models | 2023-11-28 | - |