Anthropic/claude-3-7-sonnet
Reported on 2 benchmarks across 1 task
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Natural Language Processing (2 results)
- 74.23 (best: 92.52, OpenAI/o3-2025-01-31-high)
- 82.3 (best: 94.01, Riple/Saanvi-v0.5-DeepAnalysis)