
Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.


DeepStruct: Pretraining of Language Models for Structure Prediction

Chenguang Wang, Xiao Liu, Zui Chen, Haoyun Hong, Jie Tang, Dawn Song

Published: 2022-05-21
Venue: Findings (ACL) 2022
Tasks: Relation Extraction, Coreference Resolution, Dialogue State Tracking, Named Entity Recognition (NER), Intent Detection, Event Extraction, Prediction, Open Information Extraction, Semantic Role Labeling, Joint Entity and Relation Extraction, Relation Classification, Factual probe, Language Modelling

Abstract

We introduce a method for improving the structural understanding abilities of language models. Unlike previous approaches that finetune the models with task-specific augmentation, we pretrain language models on a collection of task-agnostic corpora to generate structures from text. Our structure pretraining enables zero-shot transfer of the learned knowledge that models have about the structure tasks. We study the performance of this approach on 28 datasets, spanning 10 structure prediction tasks including open information extraction, joint entity and relation extraction, named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, factual probe, intent detection, and dialogue state tracking. We further enhance the pretraining with the task-specific training sets. We show that a 10B parameter language model transfers non-trivially to most tasks and obtains state-of-the-art performance on 21 of 28 datasets that we evaluate.
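
To make "generate structures from text" concrete, here is a minimal sketch of how a generated string of triples could be parsed back into structured tuples. The `(head; relation; tail)` text format, the example sentence, and the helper names `parse_triples` and `linearize` are assumptions for illustration only; they are not taken from this page or confirmed as the authors' implementation.

```python
import re
from typing import List, Tuple

Triple = Tuple[str, str, str]

def parse_triples(generated: str) -> List[Triple]:
    """Parse '(head; relation; tail)' groups out of generated text.

    Assumed format (hypothetical): each triple is wrapped in parentheses
    and its three fields are separated by semicolons.
    """
    triples: List[Triple] = []
    for body in re.findall(r"\(([^()]*)\)", generated):
        parts = [p.strip() for p in body.split(";")]
        # Keep only well-formed triples with three non-empty fields.
        if len(parts) == 3 and all(parts):
            triples.append((parts[0], parts[1], parts[2]))
    return triples

def linearize(triples: List[Triple]) -> str:
    """Inverse of parse_triples: turn triples into a target string."""
    return " ".join(f"({h}; {r}; {t})" for h, r, t in triples)

# Hypothetical model output for an input sentence about Iago and Othello.
output = "(Iago; instance of; person) (Iago; character in; Othello)"
print(parse_triples(output))
```

Under this framing, tasks such as named entity recognition or relation classification differ mainly in which triples the model is expected to emit, which is one way a single sequence-to-sequence model could cover many structure prediction tasks.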

Results

| Task | Dataset | Metric | Value | Model |
|---|---|---|---|---|
| Dialogue | MULTIWOZ 2.1 | Joint Acc | 54.2 | DeepStruct multi-task w/ finetune |
| Dialogue | MULTIWOZ 2.1 | Joint Acc | 53.5 | DeepStruct multi-task |
| Relation Extraction | TACRED | F1 | 76.8 | DeepStruct multi-task w/ finetune |
| Relation Extraction | TACRED | F1 | 36.1 | DeepStruct zero-shot |
| Relation Extraction | TACRED | F1 | 74.9 | DeepStruct multi-task |
| Relation Extraction | FewRel | F1 (10-way 1-shot) | 97.8 | DeepStruct multi-task w/ finetune |
| Relation Extraction | FewRel | F1 (10-way 5-shot) | 99.8 | DeepStruct multi-task w/ finetune |
| Relation Extraction | FewRel | F1 (5-way 1-shot) | 98.4 | DeepStruct multi-task w/ finetune |
| Relation Extraction | FewRel | F1 (5-way 5-shot) | 100 | DeepStruct multi-task w/ finetune |
| Relation Extraction | FewRel | F1 (10-way 1-shot) | 92.2 | DeepStruct multi-task |
| Relation Extraction | FewRel | F1 (10-way 5-shot) | 94.6 | DeepStruct multi-task |
| Relation Extraction | FewRel | F1 (5-way 1-shot) | 93.6 | DeepStruct multi-task |
| Relation Extraction | FewRel | F1 (5-way 5-shot) | 96.4 | DeepStruct multi-task |
| Relation Extraction | FewRel | F1 (10-way 1-shot) | 67.6 | DeepStruct zero-shot |
| Relation Extraction | FewRel | F1 (10-way 5-shot) | 66.4 | DeepStruct zero-shot |
| Relation Extraction | FewRel | F1 (5-way 1-shot) | 72.4 | DeepStruct zero-shot |
| Relation Extraction | FewRel | F1 (5-way 5-shot) | 70.8 | DeepStruct zero-shot |
| Relation Extraction | ACE2005 | Entity F1 | 90.2 | DeepStruct multi-task |
| Relation Extraction | ACE2005 | Relation F1 | 58.9 | DeepStruct multi-task |
| Relation Extraction | ACE2005 | Entity F1 | 90 | DeepStruct multi-task w/ finetune |
| Relation Extraction | ACE2005 | Relation F1 | 66.8 | DeepStruct multi-task w/ finetune |
| Relation Extraction | ACE2005 | Entity F1 | 31.8 | DeepStruct zero-shot |
| Relation Extraction | ACE2005 | Relation F1 | 5.3 | DeepStruct zero-shot |
| Relation Extraction | NYT | Entity F1 | 95.9 | DeepStruct multi-task w/ finetune |
| Relation Extraction | NYT | Relation F1 | 93.3 | DeepStruct multi-task w/ finetune |
| Relation Extraction | NYT | Entity F1 | 95.4 | DeepStruct multi-task |
| Relation Extraction | NYT | Relation F1 | 93.7 | DeepStruct multi-task |
| Relation Extraction | NYT | Entity F1 | 60.5 | DeepStruct zero-shot |
| Relation Extraction | NYT | Relation F1 | 28.6 | DeepStruct zero-shot |
| Relation Extraction | CoNLL04 | Entity F1 | 90.7 | DeepStruct multi-task w/ finetune |
| Relation Extraction | CoNLL04 | Relation F1 | 78.3 | DeepStruct multi-task w/ finetune |
| Relation Extraction | CoNLL04 | Entity F1 | 88.4 | DeepStruct multi-task |
| Relation Extraction | CoNLL04 | Relation F1 | 72.8 | DeepStruct multi-task |
| Relation Extraction | CoNLL04 | Entity F1 | 48.3 | DeepStruct zero-shot |
| Relation Extraction | CoNLL04 | Relation F1 | 25.8 | DeepStruct zero-shot |
| Relation Extraction | ADE Corpus | Entity F1 | 91.1 | DeepStruct multi-task w/ finetune |
| Relation Extraction | ADE Corpus | Relation F1 | 83.8 | DeepStruct multi-task w/ finetune |
| Relation Extraction | ADE Corpus | Entity F1 | 90.5 | DeepStruct multi-task |
| Relation Extraction | ADE Corpus | Relation F1 | 83.6 | DeepStruct multi-task |
| Relation Extraction | ADE Corpus | Entity F1 | 60.7 | DeepStruct zero-shot |
| Relation Extraction | ADE Corpus | Relation F1 | 10.6 | DeepStruct zero-shot |
| Relation Classification | TACRED | F1 | 36.1 | DeepStruct zero-shot |
| Relation Classification | TACRED | F1 | 74.9 | DeepStruct multi-task |
| Relation Classification | TACRED | F1 | 76.8 | DeepStruct multi-task w/ finetune |
| Relation Classification | FewRel | F1 (10-way 1-shot) | 97.8 | DeepStruct multi-task w/ finetune |
| Relation Classification | FewRel | F1 (10-way 5-shot) | 99.8 | DeepStruct multi-task w/ finetune |
| Relation Classification | FewRel | F1 (5-way 1-shot) | 98.4 | DeepStruct multi-task w/ finetune |
| Relation Classification | FewRel | F1 (5-way 5-shot) | 100 | DeepStruct multi-task w/ finetune |
| Relation Classification | FewRel | F1 (10-way 1-shot) | 92.2 | DeepStruct multi-task |
| Relation Classification | FewRel | F1 (10-way 5-shot) | 94.6 | DeepStruct multi-task |
| Relation Classification | FewRel | F1 (5-way 1-shot) | 93.6 | DeepStruct multi-task |
| Relation Classification | FewRel | F1 (5-way 5-shot) | 96.4 | DeepStruct multi-task |
| Relation Classification | FewRel | F1 (10-way 1-shot) | 67.6 | DeepStruct zero-shot |
| Relation Classification | FewRel | F1 (10-way 5-shot) | 66.4 | DeepStruct zero-shot |
| Relation Classification | FewRel | F1 (5-way 1-shot) | 72.4 | DeepStruct zero-shot |
| Relation Classification | FewRel | F1 (5-way 5-shot) | 70.8 | DeepStruct zero-shot |
| Open Information Extraction | Web | F1 | 43.8 | DeepStruct zero-shot |
| Open Information Extraction | Web | F1 | 49.1 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | Web | F1 | 50.8 | DeepStruct multi-task |
| Open Information Extraction | Penn Treebank | F1 | 51 | DeepStruct zero-shot |
| Open Information Extraction | Penn Treebank | F1 | 54.5 | DeepStruct multi-task |
| Open Information Extraction | Penn Treebank | F1 | 45.1 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | NYT | F1 | 28.9 | DeepStruct zero-shot |
| Open Information Extraction | NYT | F1 | 43.6 | DeepStruct multi-task |
| Open Information Extraction | NYT | F1 | 45 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | OIE2016 | F1 | 71.3 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | OIE2016 | F1 | 71.2 | DeepStruct multi-task |
| Open Information Extraction | OIE2016 | F1 | 28.1 | DeepStruct zero-shot |
| Open Information Extraction | ACE2005 | Argument Cl | 63.9 | DeepStruct multi-task |
| Open Information Extraction | ACE2005 | Argument Id | 67.5 | DeepStruct multi-task |
| Open Information Extraction | ACE2005 | Trigger Cl | 69.2 | DeepStruct multi-task |
| Open Information Extraction | ACE2005 | Trigger Id | 72.7 | DeepStruct multi-task |
| Open Information Extraction | ACE2005 | Argument Cl | 56.2 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | ACE2005 | Argument Id | 59.4 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | ACE2005 | Trigger Cl | 69.8 | DeepStruct multi-task w/ finetune |
| Open Information Extraction | ACE2005 | Trigger Id | 73.5 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Argument Cl | 63.9 | DeepStruct multi-task |
| Information Extraction | ACE2005 | Argument Id | 67.5 | DeepStruct multi-task |
| Information Extraction | ACE2005 | Trigger Cl | 69.2 | DeepStruct multi-task |
| Information Extraction | ACE2005 | Trigger Id | 72.7 | DeepStruct multi-task |
| Information Extraction | ACE2005 | Argument Cl | 56.2 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Argument Id | 59.4 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Trigger Cl | 69.8 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Trigger Id | 73.5 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Entity F1 | 90.2 | DeepStruct multi-task |
| Information Extraction | ACE2005 | Relation F1 | 58.9 | DeepStruct multi-task |
| Information Extraction | ACE2005 | Entity F1 | 90 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Relation F1 | 66.8 | DeepStruct multi-task w/ finetune |
| Information Extraction | ACE2005 | Entity F1 | 31.8 | DeepStruct zero-shot |
| Information Extraction | ACE2005 | Relation F1 | 5.3 | DeepStruct zero-shot |
| Information Extraction | NYT | Entity F1 | 95.9 | DeepStruct multi-task w/ finetune |
| Information Extraction | NYT | Relation F1 | 93.3 | DeepStruct multi-task w/ finetune |
| Information Extraction | NYT | Entity F1 | 95.4 | DeepStruct multi-task |
| Information Extraction | NYT | Relation F1 | 93.7 | DeepStruct multi-task |
| Information Extraction | NYT | Entity F1 | 60.5 | DeepStruct zero-shot |
| Information Extraction | NYT | Relation F1 | 28.6 | DeepStruct zero-shot |
| Information Extraction | CoNLL04 | Entity F1 | 90.7 | DeepStruct multi-task w/ finetune |
| Information Extraction | CoNLL04 | Relation F1 | 78.3 | DeepStruct multi-task w/ finetune |
| Information Extraction | CoNLL04 | Entity F1 | 88.4 | DeepStruct multi-task |
| Information Extraction | CoNLL04 | Relation F1 | 72.8 | DeepStruct multi-task |
| Information Extraction | CoNLL04 | Entity F1 | 48.3 | DeepStruct zero-shot |
| Information Extraction | CoNLL04 | Relation F1 | 25.8 | DeepStruct zero-shot |
| Information Extraction | ADE Corpus | Entity F1 | 91.1 | DeepStruct multi-task w/ finetune |
| Information Extraction | ADE Corpus | Relation F1 | 83.8 | DeepStruct multi-task w/ finetune |
| Information Extraction | ADE Corpus | Entity F1 | 90.5 | DeepStruct multi-task |
| Information Extraction | ADE Corpus | Relation F1 | 83.6 | DeepStruct multi-task |
| Information Extraction | ADE Corpus | Entity F1 | 60.7 | DeepStruct zero-shot |
| Information Extraction | ADE Corpus | Relation F1 | 10.6 | DeepStruct zero-shot |
| Semantic Role Labeling | CoNLL05 Brown | F1 | 92.1 | DeepStruct multi-task w/ finetune |
| Semantic Role Labeling | CoNLL05 Brown | F1 | 92 | DeepStruct multi-task |
| Semantic Role Labeling | CoNLL05 WSJ | F1 | 95.5 | DeepStruct multi-task |
| Semantic Role Labeling | CoNLL05 WSJ | F1 | 95.2 | DeepStruct multi-task w/ finetune |
| Semantic Role Labeling | CoNLL12 | F1 | 97.2 | DeepStruct multi-task |
| Semantic Role Labeling | CoNLL12 | F1 | 96 | DeepStruct multi-task w/ finetune |
| Named Entity Recognition (NER) | CoNLL03 | F1 | 93.1 | DeepStruct multi-task |
| Named Entity Recognition (NER) | CoNLL03 | F1 | 93 | DeepStruct multi-task w/ finetune |
| Named Entity Recognition (NER) | CoNLL03 | F1 | 44.4 | DeepStruct zero-shot |
| Named Entity Recognition (NER) | ACE2005 | F1 | 86.9 | DeepStruct multi-task w/ finetune |
| Named Entity Recognition (NER) | ACE2005 | F1 | 28.1 | DeepStruct zero-shot |
| Named Entity Recognition (NER) | GENIA | F1 | 80.8 | DeepStruct multi-task w/ finetune |
| Named Entity Recognition (NER) | GENIA | F1 | 80.2 | DeepStruct multi-task |
| Named Entity Recognition (NER) | GENIA | F1 | 47.2 | DeepStruct zero-shot |
| Named Entity Recognition (NER) | OntoNotes | F1 | 87.8 | DeepStruct multi-task w/ finetune |
| Named Entity Recognition (NER) | OntoNotes | F1 | 87.6 | DeepStruct multi-task |
| Named Entity Recognition (NER) | OntoNotes | F1 | 2.5 | DeepStruct zero-shot |
| Coreference Resolution | CoNLL12 | Average F1 | 73.1 | DeepStruct multi-task w/ finetune |
| Coreference Resolution | CoNLL12 | B3 | 71.3 | DeepStruct multi-task w/ finetune |
| Coreference Resolution | CoNLL12 | CEAFϕ4 | 73.1 | DeepStruct multi-task w/ finetune |
| Coreference Resolution | CoNLL12 | MUC | 74.9 | DeepStruct multi-task w/ finetune |
| Coreference Resolution | CoNLL12 | Average F1 | 60.6 | DeepStruct multi-task |
| Coreference Resolution | CoNLL12 | B3 | 57.7 | DeepStruct multi-task |
| Coreference Resolution | CoNLL12 | CEAFϕ4 | 60.2 | DeepStruct multi-task |
| Coreference Resolution | CoNLL12 | MUC | 63.9 | DeepStruct multi-task |
| Event Extraction | ACE2005 | Argument Cl | 63.9 | DeepStruct multi-task |
| Event Extraction | ACE2005 | Argument Id | 67.5 | DeepStruct multi-task |
| Event Extraction | ACE2005 | Trigger Cl | 69.2 | DeepStruct multi-task |
| Event Extraction | ACE2005 | Trigger Id | 72.7 | DeepStruct multi-task |
| Event Extraction | ACE2005 | Argument Cl | 56.2 | DeepStruct multi-task w/ finetune |
| Event Extraction | ACE2005 | Argument Id | 59.4 | DeepStruct multi-task w/ finetune |
| Event Extraction | ACE2005 | Trigger Cl | 69.8 | DeepStruct multi-task w/ finetune |
| Event Extraction | ACE2005 | Trigger Id | 73.5 | DeepStruct multi-task w/ finetune |
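
Most values in the results above are F1 scores. As a rough illustration of what such a number measures, the sketch below computes a generic set-based precision, recall, and F1 between predicted and gold triples. The exact-match criterion and the function name `triple_f1` are assumptions; each benchmark defines its own matching rules, and this is not the evaluation code behind the reported numbers.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]

def triple_f1(pred: Iterable[Triple], gold: Iterable[Triple]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact match on whole triples.

    Assumption: a predicted triple counts as correct only if it is
    identical to a gold triple; duplicates are collapsed by the sets.
    """
    pred_set, gold_set = set(pred), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical example: one of two predictions matches one of two gold triples.
gold = [("Iago", "character in", "Othello"), ("Othello", "author", "Shakespeare")]
pred = [("Iago", "character in", "Othello"), ("Iago", "author", "Shakespeare")]
print(triple_f1(pred, gold))
```

Scores such as those reported on this page are typically this kind of ratio multiplied by 100, so a valid F1 always lies between 0 and 100.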
