Paper: An All-Subtrees Approach To Unsupervised Parsing

ACL ID P06-1109
Title An All-Subtrees Approach To Unsupervised Parsing
Venue Annual Meeting of the Association of Computational Linguistics
Session Main Conference
Year 2006
  • Rens Bod (University of St. Andrews, St. Andrews UK)

We investigate generalizations of the all- subtrees "DOP" approach to unsupervised parsing. Unsupervised DOP models assign all possible binary trees to a set of sentences and next use (a large random subset of) all subtrees from these binary trees to compute the most probable parse trees. We will test both a relative frequency estimator for unsupervised DOP and a maximum likelihood estimator which is known to be statistically consistent. We report state-of- the-art results on English (WSJ), German (NEGRA) and Chinese (CTB) data. To the best of our knowledge this is the first paper which tests a maximum likelihood estimator for DOP on the Wall Street Journal, leading to the surprising result that an unsupervised parsing model beats a widely used supervised model (a treebank PCFG).