Guz et al. (2020)
Reported on 4 benchmarks across 1 task · 1 paper
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Natural Language Processing · 5 results
- Standard Parseval (Nuclearity) · 2020-11-06 · 61.38 · best: 70.4 (Bottom-up Llama 2 (70B))
- Standard Parseval (Nuclearity) · 2020-11-06 · 44.41 · best: 60 (Bottom-up (DeBERTa))
- Standard Parseval (Span) · 2020-11-06 · 64.55 · best: 77.8 (Bottom-up (DeBERTa))
- Standard Parseval (Span) · 72.43 · best: 79.8 (Bottom-up Llama 2 (70B))
- Standard Parseval (Span) · 72.43 · best: 79.8 (Bottom-up Llama 2 (70B))