Paper: Automatic Retrieval and Clustering of Similar Words

ACL ID P98-2127
Title Automatic Retrieval and Clustering of Similar Words
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 1998
Authors
  • Dekang Lin (University of Manitoba, Winnipeg MB)

Bootstrapping semantics from text is one of the greatest challenges in natural language learning. We first define a word similarity measure based on the distributional pattern of words. The similarity measure allows us to construct a thesaurus using a parsed corpus. We then present a new evaluation methodology for the automatically constructed thesaurus. The evaluation results show that the thesaurus is significantly closer to WordNet than Roget Thesaurus is.