Grigorii Guz, Patrick Huber, Giuseppe Carenini
RST-based discourse parsing is an important NLP task with numerous downstream applications, such as summarization, machine translation, and opinion mining. In this paper, we present a simple yet highly accurate discourse parser that incorporates recent contextual language models. Our parser establishes new state-of-the-art (SOTA) performance for predicting structure and nuclearity on two key RST datasets, RST-DT and Instr-DT. We further demonstrate that pretraining our parser on the recently released large-scale "silver-standard" discourse treebank MEGA-DT yields even larger performance gains, suggesting a novel and promising research direction in the field of discourse analysis.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Discourse Parsing | RST-DT | Standard Parseval (Span) | 72.94 | Guz et al. (2020) (pretrained) |
| Discourse Parsing | RST-DT | Standard Parseval (Nuclearity) | 61.86 | Guz et al. (2020) (pretrained) |
| Discourse Parsing | RST-DT | Standard Parseval (Nuclearity) | 61.38 | Guz et al. (2020) |
| Discourse Parsing | Instructional-DT (Instr-DT) | Standard Parseval (Span) | 65.41 | Guz et al. (2020) (pretrained) |
| Discourse Parsing | Instructional-DT (Instr-DT) | Standard Parseval (Nuclearity) | 46.59 | Guz et al. (2020) (pretrained) |
| Discourse Parsing | Instructional-DT (Instr-DT) | Standard Parseval (Span) | 64.55 | Guz et al. (2020) |
| Discourse Parsing | Instructional-DT (Instr-DT) | Standard Parseval (Nuclearity) | 44.41 | Guz et al. (2020) |
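The Standard Parseval values in the table are micro-averaged F1 scores over tree constituents: a predicted constituent counts as correct if its EDU span (for "Span") or its span plus nuclearity label (for "Nuclearity") matches a gold constituent. The sketch below illustrates the general idea; the tree representation (a set of `(start, end)` or `(start, end, nuclearity)` tuples per document) is an assumption for illustration, not the paper's actual evaluation code.

```python
# Hedged sketch of Standard Parseval-style scoring: micro-averaged F1
# over constituents pooled across all documents. Each tree is assumed
# to be a set of tuples, e.g. (start_edu, end_edu) for the Span metric
# or (start_edu, end_edu, nuclearity) for the Nuclearity metric.

def parseval_f1(gold_trees, pred_trees):
    """Micro-averaged constituent F1 across a corpus of trees."""
    matched = gold_total = pred_total = 0
    for gold, pred in zip(gold_trees, pred_trees):
        matched += len(gold & pred)   # constituents present in both trees
        gold_total += len(gold)
        pred_total += len(pred)
    precision = matched / pred_total if pred_total else 0.0
    recall = matched / gold_total if gold_total else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: one tree over four EDUs; the parser recovers 2 of 3 spans.
gold = [{(1, 4), (1, 2), (3, 4)}]
pred = [{(1, 4), (1, 2), (2, 4)}]
print(round(parseval_f1(gold, pred), 4))  # 2/3 precision and recall -> F1 = 0.6667
```

Micro-averaging (pooling counts before computing F1) weights longer documents more heavily than macro-averaging per-document scores; which variant a paper reports can shift the numbers slightly.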