GPT-4o-2024-08-06-128k

Reported on 3 benchmarks across 1 task · 1 paper

Note: results are matched by exact model name. Different papers may use the same name for different model variants.

Natural Language Processing · 3 results

Question Answering on PeerQA
AlignScore · 2023-03-15
0.1224
best: 0.1378 (GPT-3.5-Turbo-0613-16k)
GPT-4 Technical Report arXiv:2303.08774
Question Answering on PeerQA
Prometheus-2 Answer Correctness · 2023-03-15
3.4612
best: 3.0408 (GPT-3.5-Turbo-0613-16k)
GPT-4 Technical Report arXiv:2303.08774
Question Answering on PeerQA
Rouge-L · 2023-03-15
0.2266
best: 0.2414 (GPT-3.5-Turbo-0613-16k)
GPT-4 Technical Report arXiv:2303.08774