Chenguang Wang, Xiao Liu, Zui Chen, Haoyun Hong, Jie Tang, Dawn Song
We introduce a method for improving the structural understanding abilities of language models. Unlike previous approaches that finetune the models with task-specific augmentation, we pretrain language models on a collection of task-agnostic corpora to generate structures from text. Our structure pretraining enables zero-shot transfer of the learned knowledge that models have about the structure tasks. We study the performance of this approach on 28 datasets, spanning 10 structure prediction tasks including open information extraction, joint entity and relation extraction, named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, factual probe, intent detection, and dialogue state tracking. We further enhance the pretraining with the task-specific training sets. We show that a 10B parameter language model transfers non-trivially to most tasks and obtains state-of-the-art performance on 21 of 28 datasets that we evaluate.
| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Dialogue | MULTIWOZ 2.1 | Joint Acc | 54.2 | DeepStruct multi-task w/ finetune |
| Dialogue | MULTIWOZ 2.1 | Joint Acc | 53.5 | DeepStruct multi-task |
| Relation Extraction | TACRED | F1 | 76.8 | DeepStruct multi-task w/ finetune |
| Relation Extraction | TACRED | F1 | 36.1 | DeepStruct zero-shot |
| Relation Extraction | TACRED | F1 | 74.9 | DeepStruct multi-task |
| Relation Extraction | FewRel | F1 (10-way 1-shot) | 97.8 | DeepStruct multi-task w/ finetune |
| Relation Extraction | FewRel | F1 (10-way 5-shot) | 99.8 | DeepStruct multi-task w/ finetune |
| Relation Extraction | FewRel | F1 (5-way 1-shot) | 98.4 | DeepStruct multi-task w/ finetune |
| Relation Extraction | FewRel | F1 (5-way 5-shot) | 100 | DeepStruct multi-task w/ finetune |
| Relation Extraction | FewRel | F1 (10-way 1-shot) | 92.2 | DeepStruct multi-task |
| Relation Extraction | FewRel | F1 (10-way 5-shot) | 94.6 | DeepStruct multi-task |
| Relation Extraction | FewRel | F1 (5-way 1-shot) | 93.6 | DeepStruct multi-task |
| Relation Extraction | FewRel | F1 (5-way 5-shot) | 96.4 | DeepStruct multi-task |
| Relation Extraction | FewRel | F1 (10-way 1-shot) | 67.6 | DeepStruct zero-shot |
| Relation Extraction | FewRel | F1 (10-way 5-shot) | 66.4 | DeepStruct zero-shot |
| Relation Extraction | FewRel | F1 (5-way 1-shot) | 72.4 | DeepStruct zero-shot |
| Relation Extraction | FewRel | F1 (5-way 5-shot) | 70.8 | DeepStruct zero-shot |
| Relation Extraction | ACE2005 | Entity F1 | 90.2 | DeepStruct multi-task |
| Relation Extraction | ACE2005 | Relation F1 | 58.9 | DeepStruct multi-task |
| Relation Extraction | ACE2005 | Entity F1 | 90 | DeepStruct multi-task w/ finetune |
| Relation Extraction | ACE2005 | Relation F1 | 66.8 | DeepStruct multi-task w/ finetune |
| Relation Extraction | ACE2005 | Entity F1 | 31.8 | DeepStruct zero-shot |
| Relation Extraction | ACE2005 | Relation F1 | 5.3 | DeepStruct zero-shot |
| Relation Extraction | NYT | Entity F1 | 95.9 | DeepStruct multi-task w/ finetune |
| Relation Extraction | NYT | Relation F1 | 93.3 | DeepStruct multi-task w/ finetune |
| Relation Extraction | NYT | Entity F1 | 95.4 | DeepStruct multi-task |
| Relation Extraction | NYT | Relation F1 | 93.7 | DeepStruct multi-task |
| Relation Extraction | NYT | Entity F1 | 60.5 | DeepStruct zero-shot |
| Relation Extraction | NYT | Relation F1 | 28.6 | DeepStruct zero-shot |
| Relation Extraction | CoNLL04 | Entity F1 | 90.7 | DeepStruct multi-task w/ finetune |
| Relation Extraction | CoNLL04 | Relation F1 | 78.3 | DeepStruct multi-task w/ finetune |
| Relation Extraction | CoNLL04 | Entity F1 | 88.4 | DeepStruct multi-task |
| Relation Extraction | CoNLL04 | Relation F1 | 72.8 | DeepStruct multi-task |
| Relation Extraction | CoNLL04 | Entity F1 | 48.3 | DeepStruct zero-shot |
| Relation Extraction | CoNLL04 | Relation F1 | 25.8 | DeepStruct zero-shot |
| Relation Extraction | ADE Corpus | Entity F1 | 91.1 | DeepStruct multi-task w/ finetune |
| Relation Extraction | ADE Corpus | Relation F1 | 83.8 | DeepStruct multi-task w/ finetune |
| Relation Extraction | ADE Corpus | Entity F1 | 90.5 | DeepStruct multi-task |
| Relation Extraction | ADE Corpus | Relation F1 | 83.6 | DeepStruct multi-task |
| Relation Extraction | ADE Corpus | Entity F1 | 60.7 | DeepStruct zero-shot |
| Relation Extraction | ADE Corpus | Relation F1 | 10.6 | DeepStruct zero-shot |
| Relation Classification | TACRED | F1 | 36.1 | DeepStruct zero-shot |
| Relation Classification | TACRED | F1 | 74.9 | DeepStruct multi-task |
| Relation Classification | TACRED | F1 | 76.8 | DeepStruct multi-task w/ finetune |
| Relation Classification | FewRel | F1 (10-way 1-shot) | 97.8 | DeepStruct multi-task w/ finetune |
| Relation Classification | FewRel | F1 (10-way 5-shot) | 99.8 | DeepStruct multi-task w/ finetune |
| Relation Classification | FewRel | F1 (5-way 1-shot) | 98.4 | DeepStruct multi-task w/ finetune |
| Relation Classification | FewRel | F1 (5-way 5-shot) | 100 | DeepStruct multi-task w/ finetune |
| Relation Classification | FewRel | F1 (10-way 1-shot) | 92.2 | DeepStruct multi-task |
| Relation Classification | FewRel | F1 (10-way 5-shot) | 94.6 | DeepStruct multi-task |
| Relation Classification | FewRel | F1 (5-way 1-shot) | 93.6 | DeepStruct multi-task |
| Relation Classification | FewRel | F1 (5-way 5-shot) | 96.4 | DeepStruct multi-task |
| Relation Classification | FewRel | F1 (10-way 1-shot) | 67.6 | DeepStruct zero-shot |
| Relation Classification | FewRel | F1 (10-way 5-shot) | 66.4 | DeepStruct zero-shot |
| Relation Classification | FewRel | F1 (5-way 1-shot) | 72.4 | DeepStruct zero-shot |
| Relation Classification | FewRel | F1 (5-way 5-shot) | 70.8 | DeepStruct zero-shot |
| Open Information Extraction | Web | F1 | 43.8 | DeepStruct zero-shot |
| Open Information Extraction | Web | F1 | 49.1 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | Web | F1 | 50.8 | DeepStruct multi-task |
| Open Information Extraction | Penn Treebank | F1 | 51 | DeepStruct zero-shot |
| Open Information Extraction | Penn Treebank | F1 | 54.5 | DeepStruct multi-task |
| Open Information Extraction | Penn Treebank | F1 | 45.1 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | NYT | F1 | 28.9 | DeepStruct zero-shot |
| Open Information Extraction | NYT | F1 | 43.6 | DeepStruct multi-task |
| Open Information Extraction | NYT | F1 | 45 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | OIE2016 | F1 | 71.3 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | OIE2016 | F1 | 71.2 | DeepStruct multi-task |
| Open Information Extraction | OIE2016 | F1 | 28.1 | DeepStruct zero-shot |
| Open Information Extraction | ACE2005 | Argument Cl | 63.9 | DeepStruct multi-task |
| Open Information Extraction | ACE2005 | Argument Id | 67.5 | DeepStruct multi-task |
| Open Information Extraction | ACE2005 | Trigger Cl | 69.2 | DeepStruct multi-task |
| Open Information Extraction | ACE2005 | Trigger Id | 72.7 | DeepStruct multi-task |
| Open Information Extraction | ACE2005 | Argument Cl | 56.2 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | ACE2005 | Argument Id | 59.4 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | ACE2005 | Trigger Cl | 69.8 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | ACE2005 | Trigger Id | 73.5 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Argument Cl | 63.9 | DeepStruct multi-task |
| Information Extraction | ACE2005 | Argument Id | 67.5 | DeepStruct multi-task |
| Information Extraction | ACE2005 | Trigger Cl | 69.2 | DeepStruct multi-task |
| Information Extraction | ACE2005 | Trigger Id | 72.7 | DeepStruct multi-task |
| Information Extraction | ACE2005 | Argument Cl | 56.2 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Argument Id | 59.4 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Trigger Cl | 69.8 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Trigger Id | 73.5 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Entity F1 | 90.2 | DeepStruct multi-task |
| Information Extraction | ACE2005 | Relation F1 | 58.9 | DeepStruct multi-task |
| Information Extraction | ACE2005 | Entity F1 | 90 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Relation F1 | 66.8 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Entity F1 | 31.8 | DeepStruct zero-shot |
| Information Extraction | ACE2005 | Relation F1 | 5.3 | DeepStruct zero-shot |
| Information Extraction | NYT | Entity F1 | 95.9 | DeepStruct multi-task w/ finetune |
| Information Extraction | NYT | Relation F1 | 93.3 | DeepStruct multi-task w/ finetune |
| Information Extraction | NYT | Entity F1 | 95.4 | DeepStruct multi-task |
| Information Extraction | NYT | Relation F1 | 93.7 | DeepStruct multi-task |
| Information Extraction | NYT | Entity F1 | 60.5 | DeepStruct zero-shot |
| Information Extraction | NYT | Relation F1 | 28.6 | DeepStruct zero-shot |
| Information Extraction | CoNLL04 | Entity F1 | 90.7 | DeepStruct multi-task w/ finetune |
| Information Extraction | CoNLL04 | Relation F1 | 78.3 | DeepStruct multi-task w/ finetune |
| Information Extraction | CoNLL04 | Entity F1 | 88.4 | DeepStruct multi-task |
| Information Extraction | CoNLL04 | Relation F1 | 72.8 | DeepStruct multi-task |
| Information Extraction | CoNLL04 | Entity F1 | 48.3 | DeepStruct zero-shot |
| Information Extraction | CoNLL04 | Relation F1 | 25.8 | DeepStruct zero-shot |
| Information Extraction | ADE Corpus | Entity F1 | 91.1 | DeepStruct multi-task w/ finetune |
| Information Extraction | ADE Corpus | Relation F1 | 83.8 | DeepStruct multi-task w/ finetune |
| Information Extraction | ADE Corpus | Entity F1 | 90.5 | DeepStruct multi-task |
| Information Extraction | ADE Corpus | Relation F1 | 83.6 | DeepStruct multi-task |
| Information Extraction | ADE Corpus | Entity F1 | 60.7 | DeepStruct zero-shot |
| Information Extraction | ADE Corpus | Relation F1 | 10.6 | DeepStruct zero-shot |
| Semantic Role Labeling | CoNLL05 Brown | F1 | 92.1 | DeepStruct multi-task w/ finetune |
| Semantic Role Labeling | CoNLL05 Brown | F1 | 92 | DeepStruct multi-task |
| Semantic Role Labeling | CoNLL05 WSJ | F1 | 95.5 | DeepStruct multi-task |
| Semantic Role Labeling | CoNLL05 WSJ | F1 | 95.2 | DeepStruct multi-task w/ finetune |
| Semantic Role Labeling | CoNLL12 | F1 | 97.2 | DeepStruct multi-task |
| Semantic Role Labeling | CoNLL12 | F1 | 96 | DeepStruct multi-task w/ finetune |
| Named Entity Recognition (NER) | CoNLL03 | F1 | 93.1 | DeepStruct multi-task |
| Named Entity Recognition (NER) | CoNLL03 | F1 | 93 | DeepStruct multi-task w/ finetune |
| Named Entity Recognition (NER) | CoNLL03 | F1 | 44.4 | DeepStruct zero-shot |
| Named Entity Recognition (NER) | ACE2005 | F1 | 86.9 | DeepStruct multi-task w/ finetune |
| Named Entity Recognition (NER) | ACE2005 | F1 | 28.1 | DeepStruct zero-shot |
| Named Entity Recognition (NER) | GENIA | F1 | 80.8 | DeepStruct multi-task w/ finetune |
| Named Entity Recognition (NER) | GENIA | F1 | 80.2 | DeepStruct multi-task |
| Named Entity Recognition (NER) | GENIA | F1 | 47.2 | DeepStruct zero-shot |
| Named Entity Recognition (NER) | OntoNotes | F1 | 87.8 | DeepStruct multi-task w/ finetune |
| Named Entity Recognition (NER) | OntoNotes | F1 | 87.6 | DeepStruct multi-task |
| Named Entity Recognition (NER) | OntoNotes | F1 | 2.5 | DeepStruct zero-shot |
| Coreference Resolution | CoNLL12 | Average F1 | 73.1 | DeepStruct multi-task w/ finetune |
| Coreference Resolution | CoNLL12 | B3 | 71.3 | DeepStruct multi-task w/ finetune |
| Coreference Resolution | CoNLL12 | CEAFϕ4 | 73.1 | DeepStruct multi-task w/ finetune |
| Coreference Resolution | CoNLL12 | MUC | 74.9 | DeepStruct multi-task w/ finetune |
| Coreference Resolution | CoNLL12 | Average F1 | 60.6 | DeepStruct multi-task |
| Coreference Resolution | CoNLL12 | B3 | 57.7 | DeepStruct multi-task |
| Coreference Resolution | CoNLL12 | CEAFϕ4 | 60.2 | DeepStruct multi-task |
| Coreference Resolution | CoNLL12 | MUC | 63.9 | DeepStruct multi-task |
| Event Extraction | ACE2005 | Argument Cl | 63.9 | DeepStruct multi-task |
| Event Extraction | ACE2005 | Argument Id | 67.5 | DeepStruct multi-task |
| Event Extraction | ACE2005 | Trigger Cl | 69.2 | DeepStruct multi-task |
| Event Extraction | ACE2005 | Trigger Id | 72.7 | DeepStruct multi-task |
| Event Extraction | ACE2005 | Argument Cl | 56.2 | DeepStruct multi-task w/ finetune |
| Event Extraction | ACE2005 | Argument Id | 59.4 | DeepStruct multi-task w/ finetune |
| Event Extraction | ACE2005 | Trigger Cl | 69.8 | DeepStruct multi-task w/ finetune |
| Event Extraction | ACE2005 | Trigger Id | 73.5 | DeepStruct multi-task w/ finetune |