Paper: Learning Word Sense With Feature Selection And Order Identification Capabilities

ACL ID P04-1080
Title Learning Word Sense With Feature Selection And Order Identification Capabilities
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2004
Authors

This paper presents an unsupervised word sense learning algorithm that induces the senses of a target word by grouping its occurrences into a “natural” number of clusters based on the similarity of their contexts. To remove noisy words from the feature set, feature selection is performed in an unsupervised manner by optimizing a cluster validation criterion subject to some constraint. A Gaussian mixture model and the Minimum Description Length criterion are used to estimate the cluster structure and the number of clusters. Experimental results show that our algorithm can find an important feature subset, estimate the model order (cluster number), and achieve better performance than another algorithm that requires the cluster number to be provided.
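
The order-identification idea in the abstract (fit a Gaussian mixture for each candidate cluster number, then keep the one with the smallest description length) can be illustrated with a toy sketch. This is not the paper's implementation: it assumes one-dimensional data instead of context vectors, omits the feature selection step, and uses the common two-part MDL score, negative log-likelihood plus (number of free parameters / 2) · log n. The names `fit_gmm_1d`, `mdl`, and `select_order` are hypothetical and chosen only for this example.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def fit_gmm_1d(xs, k, iters=100, min_var=1e-2):
    """Fit a k-component 1-D Gaussian mixture with EM.

    Returns the log-likelihood computed in the final E-step.
    Means are initialised at evenly spaced quantiles (an assumption
    made here for determinism, not something taken from the paper).
    """
    n = len(xs)
    s = sorted(xs)
    means = [s[int((i + 0.5) * n / k)] for i in range(k)]
    mu = sum(xs) / n
    var0 = max(sum((x - mu) ** 2 for x in xs) / n, min_var)
    variances = [var0] * k
    weights = [1.0 / k] * k
    loglik = float("-inf")
    for _ in range(iters):
        # E-step: responsibilities via log-sum-exp for numerical stability.
        resp = []
        loglik = 0.0
        for x in xs:
            logs = [math.log(weights[j]) + log_gauss(x, means[j], variances[j])
                    for j in range(k)]
            m = max(logs)
            total = m + math.log(sum(math.exp(v - m) for v in logs))
            loglik += total
            resp.append([math.exp(v - total) for v in logs])
        # M-step: re-estimate weights, means, and (floored) variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, min_var)
    return loglik


def mdl(loglik, k, n):
    """Two-part MDL score: data cost plus parameter cost."""
    n_params = 3 * k - 1  # k means, k variances, k - 1 free mixing weights
    return -loglik + 0.5 * n_params * math.log(n)


def select_order(xs, max_k=5):
    """Return the cluster number with the lowest MDL score, and all scores."""
    scores = {k: mdl(fit_gmm_1d(xs, k), k, len(xs)) for k in range(1, max_k + 1)}
    return min(scores, key=scores.get), scores


# Usage: synthetic data drawn from two well-separated Gaussians.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(150)]
data += [rng.gauss(10.0, 1.0) for _ in range(150)]
best_k, scores = select_order(data)
print("selected cluster number:", best_k)
```

The penalty term grows with the number of components, so an extra cluster is accepted only when it improves the likelihood by more than its parameter cost. That is what lets the cluster number be estimated rather than supplied by the user.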