Paper: Automatic Selection of High Quality Parses Created By a Fully Unsupervised Parser

ACL ID W09-1120
Title Automatic Selection of High Quality Parses Created By a Fully Unsupervised Parser
Venue International Conference on Computational Natural Language Learning
Session Main Conference
Year 2009
Authors

The average results obtained by unsupervised statistical parsers have greatly improved in the last few years, but on many specific sentences they are of rather low quality. The output of such parsers is becoming valuable for various applications, and it is radically less expensive to create than manually annotated training data. Hence, automatic selection of high quality parses created by unsupervised parsers is an important problem. In this paper we present PUPA, a POS-based Unsupervised Parse Assessment algorithm. The algorithm assesses the quality of a parse tree using POS sequence statistics collected from a batch of parsed sentences. We evaluate the algorithm by using an unsupervised POS tagger and an unsupervised parser, selecting high quality parsed sentences from Englis...
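
The idea of assessing a parse from POS sequence statistics gathered over a batch can be illustrated with a minimal sketch. It is not the PUPA scoring function, which the abstract does not specify. It assumes, purely for illustration, that each parse is given as a list of constituents, that each constituent is the tuple of POS tags it spans, and that a parse is scored by the average relative frequency of its constituents' POS sequences within the batch. The names collect_pos_sequence_counts, score_parse, and select_top are hypothetical.

```python
from collections import Counter

# Hypothetical representation: a parse is a list of constituents, and each
# constituent is the tuple of POS tags it spans, e.g. ("DT", "NN").


def collect_pos_sequence_counts(batch):
    """Count how often each constituent POS sequence occurs in the batch."""
    counts = Counter()
    for constituents in batch:
        counts.update(constituents)
    return counts


def score_parse(constituents, counts):
    """Assumed score: mean relative frequency of the parse's POS sequences."""
    total = sum(counts.values())
    if not constituents or total == 0:
        return 0.0
    return sum(counts[c] / total for c in constituents) / len(constituents)


def select_top(batch, k):
    """Return the k parses with the highest assumed score."""
    counts = collect_pos_sequence_counts(batch)
    ranked = sorted(batch, key=lambda p: score_parse(p, counts), reverse=True)
    return ranked[:k]


# Toy batch of three parses (made-up tags, for illustration only).
p0 = [("DT", "NN"), ("VBD",)]
p1 = [("DT", "NN")]
p2 = [("NN", "VBD")]
batch = [p0, p1, p2]
best = select_top(batch, 1)
```

In this toy batch the parse whose only constituent is the most frequent POS sequence ranks first. A real selection procedure would need the statistics and weighting defined in the paper itself.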