Paper: Unsupervised Parsing With U-DOP

ACL ID W06-2912
Title Unsupervised Parsing With U-DOP
Venue International Conference on Computational Natural Language Learning
Session Main Conference
Year 2006
  • Rens Bod (University of St. Andrews, St. Andrews, UK)

We propose a generalization of the supervised DOP model to unsupervised learning. This new model, which we call U-DOP, initially assigns all possible unlabeled binary trees to a set of sentences and next uses all subtrees from (a large subset of) these binary trees to compute the most probable parse trees. We show how U-DOP can be implemented by a PCFG-reduction technique and report competitive results on English (WSJ), German (NEGRA) and Chinese (CTB) data. To the best of our knowledge, this is the first paper which accurately bootstraps structure for Wall Street Journal sentences up to 40 words obtaining roughly the same accuracy as a binarized supervised PCFG. We show that previous approaches to unsupervised parsing have shortcomings in that they either constrain the lexical or the ...
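
The first step named in the abstract, assigning all possible unlabeled binary trees to a sentence, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `binary_trees` and `catalan` are hypothetical, trees are represented as nested tuples purely for illustration, and the subtree extraction and PCFG reduction are not shown. The number of binary trees over n words is the Catalan number C(n-1), which grows very quickly with sentence length; this may help explain why the abstract speaks of "a large subset" of the trees.

```python
from functools import lru_cache
from math import comb


def catalan(n):
    """Return the n-th Catalan number."""
    return comb(2 * n, n) // (n + 1)


def binary_trees(words):
    """Return every unlabeled binary bracketing of `words` as nested tuples.

    Illustrative only: explicit enumeration is feasible for short
    sentences, since the number of trees grows as a Catalan number.
    """
    words = tuple(words)

    @lru_cache(maxsize=None)
    def build(i, j):
        # The span covers words[i:j]; a single word is a leaf.
        if j - i == 1:
            return (words[i],)
        trees = []
        # Try every split point and combine all left and right subtrees.
        for k in range(i + 1, j):
            for left in build(i, k):
                for right in build(k, j):
                    trees.append((left, right))
        return tuple(trees)

    return list(build(0, len(words)))


if __name__ == "__main__":
    sentence = "the dog barks".split()
    for tree in binary_trees(sentence):
        print(tree)
```

For a three-word sentence the sketch yields the two possible bracketings, a left-branching and a right-branching tree; for n words the length of the returned list equals `catalan(n - 1)`.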