Paper: Improved Unsupervised POS Induction through Prototype Discovery

ACL ID P10-1132
Title Improved Unsupervised POS Induction through Prototype Discovery
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2010
Authors

We present a novel fully unsupervised algorithm for POS induction from plain text, motivated by the cognitive notion of prototypes. The algorithm first identifies landmark clusters of words, serving as the cores of the induced POS categories. The rest of the words are subsequently mapped to these clusters. We utilize morphological and distributional representations computed in a fully unsupervised manner. We evaluate our algorithm on English and German, achieving the best reported results for this task.
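
The two-stage idea described in the abstract (landmark clusters first, then mapping the remaining words to them) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the landmark clusters are supplied by hand rather than discovered, the mapping rule (cosine similarity to a cluster centroid) is an assumption chosen for simplicity, and the toy vectors, the labels `C0`/`C1`, and the function `map_to_landmarks` are all hypothetical. Morphological features are omitted entirely.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def map_to_landmarks(vectors, landmarks):
    """Assign each non-landmark word to the landmark cluster whose centroid is most similar.

    Assumption: nearest-centroid mapping stands in for the paper's mapping step.
    """
    centroids = {
        label: centroid([vectors[w] for w in members])
        for label, members in landmarks.items()
    }
    # Landmark words keep the cluster they define.
    assigned = {w: label for label, members in landmarks.items() for w in members}
    for word, vec in vectors.items():
        if word not in assigned:
            assigned[word] = max(centroids, key=lambda lab: cosine(vec, centroids[lab]))
    return assigned


# Hypothetical distributional vectors: counts of three made-up context features
# (follows "the", follows "to", precedes "the").
vectors = {
    "dog": [9, 0, 1],
    "cat": [8, 0, 1],
    "run": [0, 9, 3],
    "eat": [0, 8, 6],
    "house": [7, 1, 0],
    "see": [1, 7, 8],
}

# Hand-picked landmark clusters; the paper discovers these without supervision.
landmarks = {"C0": ["dog", "cat"], "C1": ["run", "eat"]}

induced = map_to_landmarks(vectors, landmarks)
print(induced)
```

In this toy setting the landmark words act as the cores of the induced categories, and the remaining words ("house", "see") join whichever core their context counts most resemble.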