Paper: From neighborhood to parenthood: the advantages of dependency representation over bigrams in Brown clustering

ACL ID C14-1131
Title From neighborhood to parenthood: the advantages of dependency representation over bigrams in Brown clustering
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2014
Authors

We present an effective modification of the popular Brown et al. (1992) word clustering algorithm that uses a dependency language model in place of the bigram model. By leveraging syntax-based context, the resulting clusters score better when evaluated against a wordnet for Dutch. The improvements are stable across parameters such as the number of clusters, minimum frequency, and granularity. Further refinement is possible through dependency relation selection. Our approach achieves a desired clustering quality with less data, which reduces cluster creation times.
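
To make the contrast between bigram context and dependency context concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the dependency model pairs each word with its syntactic head, and that clustering quality is scored with the average mutual information objective commonly associated with Brown clustering. The function names, the `heads` encoding (head index per token, `-1` for the root), and the toy sentence are all hypothetical.

```python
import math
from collections import Counter


def bigram_pairs(tokens):
    """Neighborhood context: (previous word, word) for adjacent tokens."""
    return list(zip(tokens, tokens[1:]))


def dependency_pairs(tokens, heads):
    """Parenthood context: (head word, dependent word).

    heads[i] is the index of token i's head, or -1 for the root
    (a hypothetical encoding chosen for this sketch).
    """
    return [(tokens[h], tokens[i]) for i, h in enumerate(heads) if h >= 0]


def average_mutual_information(pairs, cluster_of):
    """Mutual information (in bits) between the clusters of the two pair slots."""
    joint = Counter((cluster_of[a], cluster_of[b]) for a, b in pairs)
    total = sum(joint.values())
    left, right = Counter(), Counter()
    for (c1, c2), n in joint.items():
        left[c1] += n
        right[c2] += n
    return sum(
        (n / total) * math.log2(n * total / (left[c1] * right[c2]))
        for (c1, c2), n in joint.items()
    )


# Toy example (assumed parse): "chased" is the root, nouns head their articles.
tokens = ["the", "cat", "chased", "the", "dog"]
heads = [1, 2, -1, 4, 2]
clusters = {"the": 0, "cat": 1, "dog": 1, "chased": 2}

print(bigram_pairs(tokens))
print(dependency_pairs(tokens, heads))
print(average_mutual_information(dependency_pairs(tokens, heads), clusters))
```

Only the pair extraction differs between the two settings; the scoring function is unchanged, which is one plausible reading of how a dependency language model can be swapped in for the bigram model. Restricting `dependency_pairs` to certain relation labels would be one way to approximate the dependency relation selection mentioned above, though the sketch carries no labels.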