Claude 3 Opus (5-shot)
Reported on 2 benchmarks across 2 tasks
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Natural Language Processing: 2 results
- Accuracy: 75.8 (best: 81.6, Meditron-70B (CoT + SC))
- Accuracy: 88.5 (best: 96.1, ST-MoE-32B 269B (fine-tuned))