Nguyen et al. (2021)
Reported on 4 benchmarks across 1 task · 1 paper
Note: results are matched by exact model name. Different papers may use the same name for different model variants.
Natural Language Processing · 4 results
- Standard Parseval (Full) · 2021-05-23 · 46.6 · best: 58.1 (Bottom-up Llama 2 (70B))
- Standard Parseval (Nuclearity) · 2021-05-23 · 59.1 · best: 70.4 (Bottom-up Llama 2 (70B))
- Standard Parseval (Relation) · 2021-05-23 · 47.8 · best: 60 (Bottom-up Llama 2 (70B))
- Standard Parseval (Span) · 2021-05-23 · 68.4 · best: 79.8 (Bottom-up Llama 2 (70B))