Leonardo F. R. Ribeiro, Martin Schmitt, Hinrich Schütze, Iryna Gurevych
Graph-to-text generation aims to generate fluent texts from graph-based data. In this paper, we investigate two recently proposed pretrained language models (PLMs) and analyze the impact of different task-adaptive pretraining strategies for PLMs in graph-to-text generation. We present a study across three graph domains: meaning representations, Wikipedia knowledge graphs (KGs), and scientific KGs. We show that the PLMs BART and T5 achieve new state-of-the-art results and that task-adaptive pretraining strategies improve their performance even further. In particular, we report new state-of-the-art BLEU scores of 49.72 on the LDC2017T10, 59.70 on the WebNLG, and 25.66 on the AGENDA datasets, corresponding to relative improvements of 31.8%, 4.5%, and 42.4%, respectively. In an extensive analysis, we identify possible reasons for the PLMs' success on graph-to-text tasks. We find evidence that their knowledge about true facts helps them perform well even when the input graph representation is reduced to a simple bag of node and edge labels.
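The relative improvements quoted in the abstract follow the usual definition, (new - old) / old. The sketch below is illustrative only: the abstract does not state the earlier best scores, so the helper `implied_baseline` (a hypothetical name) back-solves them from the reported BLEU and percentage. The resulting baselines are derived approximations, not figures reported in the text.

```python
def relative_improvement(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


def implied_baseline(new: float, rel_pct: float) -> float:
    """Back-solve the earlier score implied by a new score and its relative gain.

    Assumption: the gain was computed as (new - old) / old.
    """
    return new / (1.0 + rel_pct / 100.0)


# (BLEU, relative improvement in %) exactly as quoted in the abstract.
reported = {
    "LDC2017T10": (49.72, 31.8),
    "WebNLG": (59.70, 4.5),
    "AGENDA": (25.66, 42.4),
}

for dataset, (bleu, pct) in reported.items():
    base = implied_baseline(bleu, pct)  # derived, not reported in the text
    check = relative_improvement(bleu, base)
    print(f"{dataset}: BLEU {bleu:.2f}, implied earlier best {base:.2f}, gain {check:.1f}%")
```

Because the percentages are rounded to one decimal place, the implied baselines are only accurate to roughly the same precision.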
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Text Generation | WebNLG | BLEU | 65.05 | T5-small |
| Text Generation | WebNLG Full | BLEU | 59.7 | T5-large |
| Text Generation | WebNLG (Unseen) | BLEU | 53.67 | T5_large |
| Text Generation | WebNLG (Unseen) | METEOR | 42.26 | T5_large |
| Text Generation | WebNLG (Unseen) | chrF++ | 72.25 | T5_large |
| Text Generation | WebNLG (Unseen) | BLEU | 43.97 | BART_large |
| Text Generation | WebNLG (Unseen) | METEOR | 38.61 | BART_large |
| Text Generation | WebNLG (Unseen) | chrF++ | 66.53 | BART_large |
| Text Generation | AGENDA | BLEU | 25.66 | BART-large + STA |
| Text Generation | AGENDA | BLEU | 23.65 | BART-large |
| Text Generation | WebNLG (All) | BLEU | 59.7 | T5_large |
| Text Generation | WebNLG (All) | METEOR | 44.18 | T5_large |
| Text Generation | WebNLG (All) | chrF++ | 75.4 | T5_large |
| Text Generation | WebNLG (All) | BLEU | 54.72 | BART_large |
| Text Generation | WebNLG (All) | METEOR | 42.23 | BART_large |
| Text Generation | WebNLG (All) | chrF++ | 72.29 | BART_large |
| Text Generation | WebNLG (Seen) | BLEU | 64.71 | T5_large |
| Text Generation | WebNLG (Seen) | METEOR | 45.85 | T5_large |
| Text Generation | WebNLG (Seen) | chrF++ | 78.29 | T5_large |
| Text Generation | WebNLG (Seen) | BLEU | 63.45 | BART_large |
| Text Generation | WebNLG (Seen) | METEOR | 45.49 | BART_large |
| Text Generation | WebNLG (Seen) | chrF++ | 77.57 | BART_large |
| Data-to-Text Generation | WebNLG | BLEU | 65.05 | T5-small |
| Data-to-Text Generation | WebNLG Full | BLEU | 59.7 | T5-large |
| Data-to-Text Generation | WebNLG (Unseen) | BLEU | 53.67 | T5_large |
| Data-to-Text Generation | WebNLG (Unseen) | METEOR | 42.26 | T5_large |
| Data-to-Text Generation | WebNLG (Unseen) | chrF++ | 72.25 | T5_large |
| Data-to-Text Generation | WebNLG (Unseen) | BLEU | 43.97 | BART_large |
| Data-to-Text Generation | WebNLG (Unseen) | METEOR | 38.61 | BART_large |
| Data-to-Text Generation | WebNLG (Unseen) | chrF++ | 66.53 | BART_large |
| Data-to-Text Generation | AGENDA | BLEU | 25.66 | BART-large + STA |
| Data-to-Text Generation | AGENDA | BLEU | 23.65 | BART-large |
| Data-to-Text Generation | WebNLG (All) | BLEU | 59.7 | T5_large |
| Data-to-Text Generation | WebNLG (All) | METEOR | 44.18 | T5_large |
| Data-to-Text Generation | WebNLG (All) | chrF++ | 75.4 | T5_large |
| Data-to-Text Generation | WebNLG (All) | BLEU | 54.72 | BART_large |
| Data-to-Text Generation | WebNLG (All) | METEOR | 42.23 | BART_large |
| Data-to-Text Generation | WebNLG (All) | chrF++ | 72.29 | BART_large |
| Data-to-Text Generation | WebNLG (Seen) | BLEU | 64.71 | T5_large |
| Data-to-Text Generation | WebNLG (Seen) | METEOR | 45.85 | T5_large |
| Data-to-Text Generation | WebNLG (Seen) | chrF++ | 78.29 | T5_large |
| Data-to-Text Generation | WebNLG (Seen) | BLEU | 63.45 | BART_large |
| Data-to-Text Generation | WebNLG (Seen) | METEOR | 45.49 | BART_large |
| Data-to-Text Generation | WebNLG (Seen) | chrF++ | 77.57 | BART_large |
| Question Generation | GrailQA-Zero-Shot | FactSpotter | 94.77 | T5B |
| Question Generation | GrailQA-Zero-Shot | METEOR | 37.35 | T5B |
| Question Generation | GrailQA-Zero-Shot | BLEU | 32.2 | T5B |
| Question Generation | GrailQA-Compositional | BLEU | 31.75 | T5B |
| Question Generation | GrailQA-Compositional | FactSpotter | 94.84 | T5B |
| Question Generation | GrailQA-Compositional | METEOR | 35.64 | T5B |
| Question Generation | GrailQA-IID | BLEU | 44.51 | T5B |
| Question Generation | GrailQA-IID | FactSpotter | 99.43 | T5B |
| Question Generation | GrailQA-IID | METEOR | 42.71 | T5B |
| KG-to-Text Generation | WebNLG (Unseen) | BLEU | 53.67 | T5_large |
| KG-to-Text Generation | WebNLG (Unseen) | METEOR | 42.26 | T5_large |
| KG-to-Text Generation | WebNLG (Unseen) | chrF++ | 72.25 | T5_large |
| KG-to-Text Generation | WebNLG (Unseen) | BLEU | 43.97 | BART_large |
| KG-to-Text Generation | WebNLG (Unseen) | METEOR | 38.61 | BART_large |
| KG-to-Text Generation | WebNLG (Unseen) | chrF++ | 66.53 | BART_large |
| KG-to-Text Generation | AGENDA | BLEU | 25.66 | BART-large + STA |
| KG-to-Text Generation | AGENDA | BLEU | 23.65 | BART-large |
| KG-to-Text Generation | WebNLG (All) | BLEU | 59.7 | T5_large |
| KG-to-Text Generation | WebNLG (All) | METEOR | 44.18 | T5_large |
| KG-to-Text Generation | WebNLG (All) | chrF++ | 75.4 | T5_large |
| KG-to-Text Generation | WebNLG (All) | BLEU | 54.72 | BART_large |
| KG-to-Text Generation | WebNLG (All) | METEOR | 42.23 | BART_large |
| KG-to-Text Generation | WebNLG (All) | chrF++ | 72.29 | BART_large |
| KG-to-Text Generation | WebNLG (Seen) | BLEU | 64.71 | T5_large |
| KG-to-Text Generation | WebNLG (Seen) | METEOR | 45.85 | T5_large |
| KG-to-Text Generation | WebNLG (Seen) | chrF++ | 78.29 | T5_large |
| KG-to-Text Generation | WebNLG (Seen) | BLEU | 63.45 | BART_large |
| KG-to-Text Generation | WebNLG (Seen) | METEOR | 45.49 | BART_large |
| KG-to-Text Generation | WebNLG (Seen) | chrF++ | 77.57 | BART_large |