Paper: Unambiguity Regularization for Unsupervised Learning of Probabilistic Grammars

ACL ID D12-1121
Title Unambiguity Regularization for Unsupervised Learning of Probabilistic Grammars
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2012
Authors

We introduce a novel approach named unambiguity regularization for unsupervised learning of probabilistic natural language grammars. The approach is based on the observation that natural language is remarkably unambiguous in the sense that only a tiny portion of the large number of possible parses of a natural language sentence are syntactically valid. We incorporate an inductive bias into grammar learning in favor of grammars that lead to unambiguous parses on natural language sentences. The resulting family of algorithms includes the expectation-maximization algorithm (EM) and its variant, Viterbi EM, as well as a so-called softmax-EM algorithm. The softmax-EM algorithm can be implemented with a simple and computationally efficient extension to standard EM. In our experim...
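The abstract notes that softmax-EM amounts to a small change to the standard EM procedure. A minimal sketch of one such change is shown below, assuming the E-step posterior over candidate parses is sharpened by exponentiation and renormalization; the function name, the exponent form 1/(1 - sigma), and the regularization strength sigma are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import numpy as np

def softmax_em_e_step(log_posteriors, sigma=0.5):
    """Sharpen the E-step posterior over latent parses (illustrative sketch).

    log_posteriors: log p(z | x, theta) for each candidate parse z.
    sigma: assumed regularization strength; sigma = 0 would recover the
           standard EM posterior, and sigma -> 1 would concentrate mass
           on the single best parse, as in Viterbi EM.
    """
    # Exponentiate the posterior by 1 / (1 - sigma) and renormalize,
    # pushing probability mass toward the highest-scoring (least
    # ambiguous) parses of the sentence.
    sharpened = np.asarray(log_posteriors, dtype=float) / (1.0 - sigma)
    sharpened -= np.max(sharpened)          # subtract max for numerical stability
    weights = np.exp(sharpened)
    return weights / weights.sum()

# Example: a four-parse posterior becomes noticeably more peaked.
q = softmax_em_e_step(np.log([0.4, 0.3, 0.2, 0.1]), sigma=0.5)
```

In this reading, the sharpened weights simply replace the ordinary posterior expectations when accumulating counts in the M-step, which is why the extension stays computationally cheap.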