Llama-3-IT-8B-32k

Reported on 4 benchmarks across 2 tasks · 1 paper

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Natural Language Processing · 4 results

Question Answering on PeerQA
AlignScore · 2024-07-31
0.1016
best: 0.1378 (GPT-3.5-Turbo-0613-16k)
The Llama 3 Herd of Models arXiv:2407.21783
Question Answering on PeerQA
Prometheus-2 Answer Correctness · 2024-07-31
3.1673
best: 3.0408 (GPT-3.5-Turbo-0613-16k)
The Llama 3 Herd of Models arXiv:2407.21783
Question Answering on PeerQA
Rouge-L · 2024-07-31
0.2286
best: 0.2414 (GPT-3.5-Turbo-0613-16k)
The Llama 3 Herd of Models arXiv:2407.21783
Answerability Prediction on PeerQA
Macro F1 · 2024-07-31
0.2881
best: 0.4703 (Mistral-IT-v02-7B-32k)
The Llama 3 Herd of Models arXiv:2407.21783