Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Noah A. Smith
We introduce two first-order graph-based dependency parsers that achieve a new state of the art. The first is a consensus parser built from an ensemble of independently trained greedy LSTM transition-based parsers with different random initializations. We cast this approach as minimum Bayes risk decoding (under the Hamming cost) and argue that weaker consensus within the ensemble is a useful signal of difficulty or ambiguity. The second parser is a "distillation" of the ensemble into a single model. We train the distillation parser using a structured hinge loss objective with a novel cost that incorporates ensemble uncertainty estimates for each possible attachment, thereby avoiding the intractable cross-entropy computations that standard distillation objectives would require when applied to structured outputs. The first-order distillation parser matches or surpasses the state of the art on English, Chinese, and German.
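The sketch below illustrates the consensus idea: under the Hamming cost, the minimum Bayes risk parse maximizes the sum of per-attachment marginal probabilities, and with an ensemble those marginals can be approximated by vote fractions. This is a minimal Python sketch, not the paper's implementation: the function names, head-array encoding, and toy data are illustrative, and the paper's first-order MST decoding is simplified here to a per-dependent argmax, which coincides with MST decoding only when the argmax attachments already form a tree.

```python
import numpy as np

def ensemble_attachment_scores(parses, n):
    """Approximate marginal attachment probabilities by ensemble vote
    fractions: scores[h, d] = fraction of parsers that attach dependent d
    to head h.  Each entry of `parses` is a head array for one parser,
    where heads[d] is the head of token d (0 = ROOT, tokens are 1..n)."""
    scores = np.zeros((n + 1, n + 1))
    for heads in parses:
        for d in range(1, n + 1):
            scores[heads[d], d] += 1.0
    return scores / len(parses)

def mbr_consensus_parse(parses, n):
    """Minimum Bayes risk parse under the Hamming cost: maximize the sum
    of per-attachment vote shares.  NOTE: the paper decodes with a
    first-order MST algorithm to guarantee a well-formed tree; the
    per-dependent argmax here is a simplification."""
    scores = ensemble_attachment_scores(parses, n)
    return [0] + [int(np.argmax(scores[:, d])) for d in range(1, n + 1)]

# Toy ensemble: three greedy parsers on a 3-token sentence (index 0 is ROOT).
parses = [
    [0, 2, 0, 2],  # parser 1: token 1 <- 2, token 2 <- ROOT, token 3 <- 2
    [0, 2, 0, 2],  # parser 2 agrees everywhere
    [0, 3, 0, 2],  # parser 3 disagrees on token 1 (low-consensus attachment)
]
print(mbr_consensus_parse(parses, n=3))  # -> [0, 2, 0, 2]
```

The same vote fractions double as the per-attachment uncertainty estimates that the distillation cost builds on: attachments with weak consensus (like token 1 above) signal the difficulty or ambiguity the abstract describes.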
| Task | Dataset | Metric | Value (%) | Model |
|---|---|---|---|---|
| Dependency Parsing | Penn Treebank | UAS (unlabeled attachment score) | 94.26 | Distilled neural FOG |
| Dependency Parsing | Penn Treebank | LAS (labeled attachment score) | 92.06 | Distilled neural FOG |
| Dependency Parsing | Penn Treebank | POS tagging accuracy | 97.44 | Distilled neural FOG |