Claude3.5-Sonnet
Reported on 1 benchmark across 1 task
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Knowledge Base: 1 result
- 16 best: 94.4 (Xolver)