MID
Reported on 3 benchmarks across 2 tasks · 1 paper · 3 SOTA
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Reasoning · 3 results
- Kendall's Tau-b · 2022-05-25 · SOTA · 37.3
- Kendall's Tau-c · 2022-05-25 · SOTA · 54.9
- Mean Accuracy · 2022-05-25 · SOTA · 85.2