Paper: Painless Unsupervised Learning with Features

ACL ID N10-1083
Title Painless Unsupervised Learning with Features
Venue Human Language Technologies
Session Main Conference
Year 2010
Authors

We show how features can easily be added to standard generative models for unsupervised learning, without requiring complex new training methods. In particular, each component multinomial of a generative model can be turned into a miniature logistic regression model if feature locality permits. The intuitive EM algorithm still applies, but with a gradient-based M-step familiar from discriminative training of logistic regression models. We apply this technique to part-of-speech induction, grammar induction, word alignment, and word segmentation, incorporating a few linguistically-motivated features into the standard generative model for each task. These feature-enhanced models each outperform their basic counterparts by a substantial margin, and even compete with and surpass m...
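
The central idea of the abstract, replacing a component multinomial with a small logistic regression model and fitting it in a gradient-based M-step, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`softmax_probs`, `m_step`), the toy outcomes and features, the L2 penalty form, and the use of plain gradient ascent as the optimizer are all assumptions made for illustration. The expected counts are taken as given, standing in for the output of an E-step.

```python
import math

def softmax_probs(weights, features):
    """Multinomial over outcomes d with theta_d proportional to exp(w . f(d))."""
    scores = [sum(w * x for w, x in zip(weights, f)) for f in features]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def m_step(expected_counts, features, weights, lr=0.5, l2=0.0, iters=500):
    """Gradient ascent on sum_d e_d * log theta_d(w) - l2 * ||w||^2.

    expected_counts: e_d, assumed to come from an E-step (hypothetical input).
    The step is scaled by the total count so lr behaves like a per-event rate.
    """
    weights = list(weights)
    total = sum(expected_counts)
    for _ in range(iters):
        probs = softmax_probs(weights, features)
        new_weights = []
        for k, w_k in enumerate(weights):
            # Feature mass under the expected counts ...
            observed = sum(e * f[k] for e, f in zip(expected_counts, features))
            # ... minus feature mass predicted by the current model.
            predicted = total * sum(p * f[k] for p, f in zip(probs, features))
            grad_k = observed - predicted - 2.0 * l2 * w_k
            new_weights.append(w_k + lr * grad_k / total)
        weights = new_weights
    return weights

# Toy emission distribution over three word types (illustrative only).
outcomes = ["walked", "talked", "dog"]
# One indicator feature per outcome, plus a shared "ends in -ed" feature.
features = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
counts = [3.0, 1.0, 4.0]  # stand-in expected counts from an E-step
fitted = m_step(counts, features, [0.0] * 4)
probs = softmax_probs(fitted, features)
```

Because this toy feature set includes one indicator per outcome and no regularization is applied, the fitted distribution should approach the relative frequencies of the expected counts (3/8, 1/8, 4/8), which is what an ordinary multinomial M-step would produce. Shared features such as the suffix indicator only change the result once regularization is applied or the indicator features are removed, since the weights then have to generalize across outcomes.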