Paper: Inside-Outside Reestimation From Partially Bracketed Corpora

ACL ID P92-1017
Title Inside-Outside Reestimation From Partially Bracketed Corpora
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1992

The inside-outside algorithm for inferring the parameters of a stochastic context-free grammar is extended to take advantage of constituent information (constituent bracketing) in a partially parsed corpus. Experiments on formal and natural language parsed corpora show that the new algorithm can achieve faster convergence and better modeling of hierarchical structure than the original one. In particular, over 90% test set bracketing accuracy was achieved for grammars inferred by our algorithm from a training set of hand-parsed part-of-speech strings for sentences in the Air Travel Information System spoken language corpus. Finally, the new algorithm has better time complexity than the original one when sufficient bracketing is provided.

1. MOTIVATION

The most successful stocha...