Paper: Domain-Specific Sense Distributions And Predominant Sense Acquisition

ACL ID H05-1053
Title Domain-Specific Sense Distributions And Predominant Sense Acquisition
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2005
Authors

Distributions of the senses of words are often highly skewed. This fact is exploited by word sense disambiguation (WSD) sys- tems which back off to the predominant sense of a word when contextual clues are not strong enough. The domain of a doc- ument has a strong influence on the sense distribution of words, but it is not feasi- ble to produce large manually annotated corpora for every domain of interest. In this paper we describe the construction of three sense annotated corpora in different domains for a sample of English words. We apply an existing method for acquir- ing predominant sense information auto- matically from raw text, and for our sam- ple demonstrate that (1) acquiring such information automatically from a mixed- domain corpus is more accurate than de- riving it from SemCo...