Paper: Finding Predominant Word Senses In Untagged Text

ACL ID P04-1036
Title Finding Predominant Word Senses In Untagged Text
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2004
Authors

In word sense disambiguation (WSD), the heuristic of choosing the most common sense is extremely powerful because the distribution of the senses of a word is often skewed. The problem with using the predominant, or first sense heuristic, aside from the fact that it does not take surrounding context into account, is that it assumes some quantity of hand-tagged data. Whilst there are a few hand-tagged corpora available for some languages, one would expect the frequency distribution of the senses of words, particularly topical words, to depend on the genre and domain of the text under consideration. We present work on the use of a thesaurus acquired from raw textual corpora and the WordNet similarity package to find predominant noun senses automatically. The acquired predominant senses g...