Paper: Parsimonious Data-Oriented Parsing

ACL ID D07-1058
Title Parsimonious Data-Oriented Parsing
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2007

This paper explores a parsimonious approach to Data-Oriented Parsing. While allowing, in principle, all possible subtrees of trees in the treebank to be productive elements, our approach aims at finding a manageable subset of these trees that can accurately describe empirical distributions over phrase-structure trees. The proposed algorithm leads to computationally much more tractable parsers, as well as linguistically more informative grammars. The parser is evaluated on the OVIS and WSJ corpora, and shows improvements in efficiency, parse accuracy and test-set likelihood.

1 Data-Oriented Parsing

Data-Oriented Parsing (DOP) is a framework for statistical parsing and language modeling originally proposed by Scha (1990). Some of its innovations, although radical at the time, are now wi...