Paper: Efficient Algorithms For Parsing The DOP Model

ACL ID W96-0214
Title Efficient Algorithms For Parsing The DOP Model
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 1996
Authors

Excellent results have been reported for Data- Oriented Parsing (DOP) of natural language texts (Bod, 1993c). Unfortunately, existing algorithms are both computationally intensive and difficult to implement. Previous algorithms are expen- sive due to two factors: the exponential number of rules that must be generated and the use of a Monte Carlo parsing algorithm. In this paper we solve the first problem by a novel reduction of the DOP model to:a small, equivalent probabilistic context-free grammar. We solve the second prob- lem by a novel deterministic parsing strategy that maximizes the expected number of correct con- stituents, rather than the probability of a correct parse tree. Using ithe optimizations, experiments yield a 97% crossing brackets rate and 88% zero crossing brackets rate...