Paper: Using Adaptor Grammars to Identify Synergies in the Unsupervised Acquisition of Linguistic Structure

ACL ID P08-1046
Title Using Adaptor Grammars to Identify Synergies in the Unsupervised Acquisition of Linguistic Structure
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2008
Authors

Adaptor grammars (Johnson et al., 2007b) are a non-parametric Bayesian extension of Probabilistic Context-Free Grammars (PCFGs) which in effect learn the probabilities of entire subtrees. In practice, this means that an adaptor grammar learns the structures useful for generating the training data as well as their probabilities. We present several different adaptor grammars that learn to segment phonemic input into words by modeling different linguistic properties of the input. One of the advantages of a grammar-based framework is that it is easy to combine grammars, and we use this ability to compare models that capture different kinds of linguistic structure. We show that incorporating both unsupervised syllabification and collocation-finding into the adaptor grammar significant...
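
The idea that an adaptor grammar "learns the probabilities of entire subtrees" can be illustrated with a minimal sketch. It is not the inference procedure from the paper. It assumes a single adapted nonterminal (Word), a toy base PCFG in which a word is a sequence of uniformly distributed phonemes with a geometric stopping probability, and a Dirichlet-process cache (a Pitman-Yor process with its discount set to zero). All names and parameter values (base_prob, AdaptedWord, n_phonemes, p_stop, alpha) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def base_prob(word, n_phonemes=4, p_stop=0.5):
    """Probability of a word under a toy base PCFG (Word -> Phoneme+).

    Assumption: phonemes are uniform over n_phonemes symbols and the
    expansion stops after each phoneme with probability p_stop.
    """
    if not word:
        return 0.0
    length = len(word)
    continue_prob = (1 - p_stop) ** (length - 1)
    return continue_prob * p_stop * (1.0 / n_phonemes) ** length


class AdaptedWord:
    """Dirichlet-process cache over whole words (illustrative only).

    Previously generated words are reused in proportion to their counts;
    new words fall back to the base PCFG with weight alpha.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.counts = Counter()
        self.total = 0

    def prob(self, word):
        # Predictive probability: (n_w + alpha * P0(w)) / (n + alpha)
        cached = self.counts[word]
        return (cached + self.alpha * base_prob(word)) / (self.total + self.alpha)

    def observe(self, word):
        # Record one more use of this whole word (subtree).
        self.counts[word] += 1
        self.total += 1


if __name__ == "__main__":
    adaptor = AdaptedWord(alpha=1.0)
    print("before:", adaptor.prob("ab"))
    adaptor.observe("ab")
    adaptor.observe("ab")
    print("after: ", adaptor.prob("ab"))
```

In this sketch, a word that has been generated before becomes much more probable than the base grammar alone would make it, which is the "rich get richer" behaviour that lets frequently reused structures act as learned units.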