Paper: Inducing Word Sense with Automatically Learned Hidden Concepts

ACL ID C14-1035
Title Inducing Word Sense with Automatically Learned Hidden Concepts
Venue International Conference on Computational Linguistics
Session Main Conference
Year 2014
Authors

Word Sense Induction (WSI) aims to automatically induce the meanings of a polysemous word from unlabeled corpora. In this paper, we first propose a novel Bayesian parametric model for WSI. Unlike previous work, our research introduces a layer of hidden concepts and views senses as mixtures of concepts. We believe that concepts generalize the contexts, allowing the model to measure sense similarity at a more general level. Zipf's law of meaning is used to preset the number of senses for the parametric model. We further extend the parametric model to a non-parametric model, which not only simplifies the problem of model selection but also brings improved performance. We test our model on the benchmark datasets released by SemEval-2010 and SemEval-2007. The test results show that our...
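
To make the idea of "senses as mixtures of concepts" concrete, here is a minimal sketch of one plausible generative reading of the abstract. It is an illustration under stated assumptions, not the paper's actual model: the three-level structure (sense, then concept, then word), the symmetric Dirichlet priors, and all names (`sample_dirichlet`, `build_toy_model`, `generate_context`) are hypothetical. The number of senses is passed in as a fixed value, standing in for the preset sense number of the parametric model.

```python
import random


def sample_dirichlet(alpha, size, rng):
    """Draw one sample from a symmetric Dirichlet via normalized Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def build_toy_model(n_senses, n_concepts, vocab, rng, alpha=1.0):
    """Sample toy parameters (assumed structure, not taken from the paper)."""
    return {
        # Distribution over the senses of the target word.
        "sense_prior": sample_dirichlet(alpha, n_senses, rng),
        # Each sense is a mixture over hidden concepts.
        "sense_to_concept": [
            sample_dirichlet(alpha, n_concepts, rng) for _ in range(n_senses)
        ],
        # Each concept is a distribution over context words.
        "concept_to_word": [
            sample_dirichlet(alpha, len(vocab), rng) for _ in range(n_concepts)
        ],
        "vocab": list(vocab),
    }


def generate_context(model, n_words, rng):
    """Generate one context: pick a sense, then a concept and a word per slot."""
    senses = range(len(model["sense_prior"]))
    sense = rng.choices(senses, weights=model["sense_prior"])[0]
    concept_mix = model["sense_to_concept"][sense]
    words = []
    for _ in range(n_words):
        concept = rng.choices(range(len(concept_mix)), weights=concept_mix)[0]
        word = rng.choices(
            model["vocab"], weights=model["concept_to_word"][concept]
        )[0]
        words.append(word)
    return sense, words


if __name__ == "__main__":
    rng = random.Random(0)
    vocab = ["river", "money", "loan", "water", "shore", "deposit"]
    model = build_toy_model(n_senses=2, n_concepts=3, vocab=vocab, rng=rng)
    print(generate_context(model, n_words=5, rng=rng))
```

Because two senses can share concepts in this sketch, their concept mixtures can be compared directly, which is one way to read the claim that concepts let the model measure sense similarity at a more general level than raw context words.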