Paper: Precise N-Gram Probabilities From Stochastic Context-Free Grammars

ACL ID P94-1011
Title Precise N-Gram Probabilities From Stochastic Context-Free Grammars
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1994
Authors

We present an algorithm for computing n-gram probabilities from stochastic context-free grammars, a procedure that can alleviate some of the standard problems associated with n-grams (estimation from sparse data, lack of linguistic structure, among others). The method operates via the computation of substring expectations, which in turn is accomplished by solving systems of linear equations derived from the grammar. The procedure is fully implemented and has proved viable and useful in practice.
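
As an illustration of how substring expectations can reduce to a linear system, the following is a minimal sketch for the simplest case only (n = 1, unigrams); it is not the paper's full algorithm. The toy grammar, the names `GRAMMAR`, `START`, `solve`, and `unigram_probabilities`, and the use of exact rational arithmetic are all assumptions made for this example. The expected number of uses c(X) of each nonterminal X in a derivation satisfies c(X) = [X is the start symbol] + sum over rules Y -> rhs of P(Y -> rhs) * (occurrences of X in rhs) * c(Y), which is linear in the unknowns c(.). Expected word counts then follow from the nonterminal counts, and normalizing them gives unigram probabilities.

```python
from fractions import Fraction

# Hypothetical toy grammar: nonterminal -> list of (probability, right-hand side).
# Symbols that are not keys of the dictionary are treated as terminals (words).
GRAMMAR = {
    "S": [(Fraction(1), ["A", "B"])],
    "A": [(Fraction(1, 2), ["a", "A"]), (Fraction(1, 2), ["a"])],
    "B": [(Fraction(1), ["b"])],
}
START = "S"


def solve(matrix, rhs):
    """Solve matrix * x = rhs exactly by Gauss-Jordan elimination.

    Assumes the system is nonsingular, which is expected when the grammar
    generates finite strings with probability one.
    """
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def unigram_probabilities(grammar, start):
    """Return {word: probability} from expected word counts per derivation."""
    nts = list(grammar)
    idx = {x: i for i, x in enumerate(nts)}
    n = len(nts)
    # Row X encodes: c(X) - sum_Y E[# of X in one expansion of Y] * c(Y) = [X == start]
    system = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for y, rules in grammar.items():
        for prob, symbols in rules:
            for sym in symbols:
                if sym in idx:
                    system[idx[sym]][idx[y]] -= prob
    rhs = [Fraction(int(x == start)) for x in nts]
    counts = solve(system, rhs)

    # Expected count of each word: sum over rules of P(rule) * c(left-hand side).
    expect = {}
    for y, rules in grammar.items():
        for prob, symbols in rules:
            for sym in symbols:
                if sym not in idx:
                    expect[sym] = expect.get(sym, Fraction(0)) + prob * counts[idx[y]]
    total = sum(expect.values())
    return {word: e / total for word, e in expect.items()}


if __name__ == "__main__":
    for word, p in sorted(unigram_probabilities(GRAMMAR, START).items()):
        print(word, p)
```

For this toy grammar the strings are a^k b with probability (1/2)^k, so the expected number of "a" tokens per string is 2 and of "b" tokens is 1, which the linear system reproduces without enumerating any strings. Longer n-grams require additional bookkeeping for substrings that span several constituents, which this sketch does not attempt.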