Paper: Unsupervised Part-Of-Speech Tagging Employing Efficient Graph Clustering

ACL ID P06-3002
Title Unsupervised Part-Of-Speech Tagging Employing Efficient Graph Clustering
Venue Annual Meeting of the Association for Computational Linguistics
Session Student Session
Year 2006
Authors

This paper describes an unsupervised part-of-speech (POS) tagging system that relies on graph clustering methods. Unlike current state-of-the-art approaches, the method itself determines the kind and number of tags. We compute and merge two partitionings of word graphs: one based on context similarity of high-frequency words, the other on log-likelihood statistics for lower-frequency words. Using the resulting word clusters as a lexicon, a Viterbi POS tagger is trained and then refined by a morphological component. The approach is evaluated on three different languages by measuring agreement with existing taggers.
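
To make the first partitioning step more concrete, the following is a minimal sketch of grouping words by context similarity. It is not the paper's implementation. The abstract does not specify the similarity measure, the context window, or the clustering algorithm, so everything here is an assumption for illustration: the function names (`context_vectors`, `cosine`, `cluster_words`), the use of immediate left and right neighbours as context, cosine similarity, the `threshold` value, and connected components as a simple stand-in for the graph clustering step.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, targets):
    """Count immediate left/right neighbours for each target word (assumed context)."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        padded = ["<s>"] + list(sent) + ["</s>"]
        for i in range(1, len(padded) - 1):
            word = padded[i]
            if word in targets:
                vectors[word][("L", padded[i - 1])] += 1
                vectors[word][("R", padded[i + 1])] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_words(sentences, targets, threshold=0.5):
    """Link words whose contexts are similar, then return connected components.

    Connected components are a simple stand-in here, not the paper's
    graph clustering method; the threshold is an arbitrary assumption.
    """
    vectors = context_vectors(sentences, targets)
    words = sorted(vectors)

    # Build an undirected similarity graph over the target words.
    neighbours = {w: set() for w in words}
    for i, u in enumerate(words):
        for v in words[i + 1:]:
            if cosine(vectors[u], vectors[v]) >= threshold:
                neighbours[u].add(v)
                neighbours[v].add(u)

    # Collect connected components with an iterative depth-first search.
    clusters, seen = [], set()
    for w in words:
        if w in seen:
            continue
        stack, component = [w], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters
```

On a toy corpus such as "the cat sleeps", "a dog sleeps", "the dog runs", "a cat runs" with the target words "the", "a", "cat" and "dog", this sketch places "a" and "the" in one cluster and "cat" and "dog" in another, since each pair shares identical neighbour counts. A realistic setup would restrict the targets to the most frequent words of a large corpus and would need a more robust clustering step.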