GPT-4o-2024-08-06-128k
Reported on 3 benchmarks across 1 task · 1 paper
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Natural Language Processing3 results
- AlignScore· 2023-03-150.1224best: 0.1378 (GPT-3.5-Turbo-0613-16k)
- Prometheus-2 Answer Correctness· 2023-03-153.4612best: 3.0408 (GPT-3.5-Turbo-0613-16k)
- Rouge-L· 2023-03-150.2266best: 0.2414 (GPT-3.5-Turbo-0613-16k)