Paper: Stacking Dependency Parsers

ACL ID: D08-1017
Title: Stacking Dependency Parsers
Venue: Conference on Empirical Methods in Natural Language Processing
Session: Main Conference
Year: 2008

We explore a stacked framework for learning to predict dependency structures for natural language sentences. A typical approach in graph-based dependency parsing has been to assume a factorized model, where local features are used but a global function is optimized (McDonald et al., 2005b). Recently Nivre and McDonald (2008) used the output of one dependency parser to provide features for another. We show that this is an example of stacked learning, in which a second predictor is trained to improve the performance of the first. Further, we argue that this technique is a novel way of approximating rich non-local features in the second parser, without sacrificing efficient, model-optimal prediction. Experiments on twelve languages show that stacking transition-based and graph...
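
The stacking idea in the abstract, where one parser's output supplies features for a second parser, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the level-0 "parser" simply attaches each token to the preceding one, the feature names (`dist=...`, `level0_agrees`) and the weights are hypothetical, and the level-1 predictor picks each token's head independently rather than decoding a well-formed tree.

```python
from typing import Dict, List

# Tokens are numbered 1..n; index 0 is an artificial root.

def level0_predict(n_tokens: int) -> List[int]:
    """Hypothetical level-0 parser: attach every token to the one before it."""
    return [mod - 1 for mod in range(1, n_tokens + 1)]

def arc_features(head: int, mod: int, level0_heads: List[int]) -> Dict[str, float]:
    """Local arc features, augmented with the level-0 prediction."""
    return {
        f"dist={head - mod}": 1.0,
        # Stacked feature: did level 0 choose this head for this modifier?
        "level0_agrees": 1.0 if level0_heads[mod - 1] == head else 0.0,
    }

def score(feats: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear score of one candidate arc."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())

def level1_predict(n_tokens: int, weights: Dict[str, float]) -> List[int]:
    """Level-1 predictor: choose the best-scoring head for each token.

    Simplification: heads are chosen independently per token, so the
    result is not guaranteed to be a tree.
    """
    level0_heads = level0_predict(n_tokens)
    heads = []
    for mod in range(1, n_tokens + 1):
        candidates = [h for h in range(0, n_tokens + 1) if h != mod]
        best = max(
            candidates,
            key=lambda h: score(arc_features(h, mod, level0_heads), weights),
        )
        heads.append(best)
    return heads

# With only the stacked feature weighted, level 1 reproduces level 0.
follow = level1_predict(3, {"level0_agrees": 1.0})
# A stronger local feature lets level 1 override level 0 where it disagrees.
override = level1_predict(3, {"level0_agrees": 1.0, "dist=-2": 3.0})
```

In this sketch the level-0 output enters the second predictor only as one extra arc feature, so arcs are still scored locally and exact decoding would remain as cheap as in the unstacked model. In a trained system the weights would be learned, which is what lets the second predictor decide when to trust the first.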