Paper: A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction

ACL ID P11-1087
Title A Hierarchical Pitman-Yor Process HMM for Unsupervised Part of Speech Induction
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2011
Authors

In this work we address the problem of unsupervised part-of-speech induction by bringing together several strands of research into a single model. We develop a novel hidden Markov model whose distributions are smoothed with a hierarchical Pitman-Yor process prior, which provides an elegant and principled means of incorporating lexical characteristics. Central to our approach is a new type-based sampling algorithm for hierarchical Pitman-Yor models in which we track fractional table counts. In an empirical evaluation we show that our model consistently outperforms the current state of the art across 10 languages.
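
To make the smoothing idea concrete, the following is a minimal sketch of the standard Chinese-restaurant predictive probability for a single Pitman-Yor process backed by a base distribution. It is not the paper's sampler or its HMM. The function name `pyp_predictive`, the discount and concentration values, the toy counts, and the uniform base distribution are all assumptions chosen for illustration. Table counts are kept as floats, so fractional values could be passed in, but how the paper computes such counts is not shown here.

```python
def pyp_predictive(word, counts, tables, base_prob,
                   discount=0.5, concentration=1.0):
    """Predictive probability of `word` under one Pitman-Yor restaurant.

    counts[w]  : number of customers (observations) of w
    tables[w]  : number of tables serving w (may be fractional)
    base_prob  : probability of `word` under the back-off (base) distribution
    All parameter values are illustrative assumptions.
    """
    total_customers = sum(counts.values())
    total_tables = sum(tables.values())
    if total_customers == 0:
        # An empty restaurant backs off entirely to the base distribution.
        return base_prob
    c_w = counts.get(word, 0)
    t_w = tables.get(word, 0.0)
    # Mass from existing tables, discounted once per table.
    seen = max(c_w - discount * t_w, 0.0)
    # Mass reserved for opening a new table, spread by the base distribution.
    backoff = (concentration + discount * total_tables) * base_prob
    return (seen + backoff) / (concentration + total_customers)


# Toy usage: a four-word vocabulary with a uniform base distribution.
vocab = ["the", "cat", "dog", "fish"]
counts = {"the": 3, "cat": 1}
tables = {"the": 1.0, "cat": 1.0}
probs = {w: pyp_predictive(w, counts, tables, 1.0 / len(vocab)) for w in vocab}
```

In a hierarchical model, `base_prob` would itself come from another restaurant of the same form (for example, a lower-order context), which is what lets rare events borrow probability mass from more general distributions.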