Paper: A Practical Solution To The Problem Of Automatic Word Sense Induction

ACL ID P04-3026
Title A Practical Solution To The Problem Of Automatic Word Sense Induction
Venue Annual Meeting of the Association for Computational Linguistics
Session System Demonstration
Year 2004
Authors

Recent studies in word sense induction are based on clustering global co-occurrence vectors, i.e. vectors that reflect the overall behavior of a word in a corpus. If a word is semantically ambiguous, this means that these vectors are mixtures of all its senses. Inducing a word’s senses therefore involves the difficult problem of recovering the sense vectors from the mixtures. In this paper we argue that the demixing problem can be avoided since the contextual behavior of the senses is directly observable in the form of the local contexts of a word. From human disambiguation performance we know that the context of a word is usually sufficient to determine its sense. Based on this observation we describe an algorithm that discovers the different senses of an ambiguous word by clu...
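The abstract is cut off before the algorithm is described, so the following is only a minimal sketch of the general idea it states: represent each occurrence of an ambiguous word by its local context and group similar contexts, rather than demixing one global vector. The window size, the toy stopword list, the cosine threshold, the single-link grouping, and all function names are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from math import sqrt

# Toy stopword list (an assumption; a real system would use a fuller list).
STOPWORDS = {"the", "a", "of", "in", "on", "from", "his", "her", "he", "was", "near"}

# Tiny illustrative corpus (made up for this sketch).
SENTENCES = [
    "the palm tree grows near the beach",
    "coconut falls from the palm tree on the beach",
    "he held the coin in the palm of his hand",
    "the palm of her hand was cold",
]


def local_contexts(sentences, target, window=3):
    """Return one bag-of-words Counter per occurrence of `target`."""
    contexts = []
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            # Words up to `window` positions to the left and right of the target.
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            contexts.append(Counter(t for t in neighbours if t not in STOPWORDS))
    return contexts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_contexts(contexts, threshold=0.3):
    """Single-link grouping: contexts with cosine >= threshold share a cluster."""
    parent = list(range(len(contexts)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(contexts)):
        for j in range(i + 1, len(contexts)):
            if cosine(contexts[i], contexts[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(contexts)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    print(cluster_contexts(local_contexts(SENTENCES, "palm")))
```

On this toy corpus the two "tree" occurrences of "palm" share the context word "tree" and the two "hand" occurrences share "hand", so the sketch separates them into two groups. Each group stands in for one induced sense, and no mixture vector ever has to be decomposed.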