Paper: Painless Unsupervised Learning with Features

ACL ID N10-1083
Title Painless Unsupervised Learning with Features
Venue Human Language Technologies
Session Main Conference
Year 2010

We show how features can easily be added to standard generative models for unsuper- vised learning, without requiring complex new training methods. In particular, each component multinomial of a generative model can be turned into a miniature logistic regres- sion model if feature locality permits. The in- tuitive EM algorithm still applies, but with a gradient-based M-step familiar from discrim- inative training of logistic regression mod- els. We apply this technique to part-of-speech induction, grammar induction, word align- ment, and word segmentation, incorporating a few linguistically-motivated features into the standard generative model for each task. These feature-enhanced models each outper- form their basic counterparts by a substantial margin, and even compete with and surpass m...