Paper: Efficient Algorithms For Parsing The DOP Model

ACL ID W96-0214
Title Efficient Algorithms For Parsing The DOP Model
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 1996

Excellent results have been reported for Data- Oriented Parsing (DOP) of natural language texts (Bod, 1993c). Unfortunately, existing algorithms are both computationally intensive and difficult to implement. Previous algorithms are expen- sive due to two factors: the exponential number of rules that must be generated and the use of a Monte Carlo parsing algorithm. In this paper we solve the first problem by a novel reduction of the DOP model to:a small, equivalent probabilistic context-free grammar. We solve the second prob- lem by a novel deterministic parsing strategy that maximizes the expected number of correct con- stituents, rather than the probability of a correct parse tree. Using ithe optimizations, experiments yield a 97% crossing brackets rate and 88% zero crossing brackets rate...