Paper: Automatic Acquisition Of Hyponyms From Large Text Corpora

ACL ID C92-2082
Title Automatic Acquisition Of Hyponyms From Large Text Corpora
Venue International Conference on Computational Linguistics
Session Main Conference
Year 1992
Authors
  • Marti A. Hearst (University of California at Berkeley, Berkeley CA; Palo Alto Research Center, Palo Alto CA)

We describe a method for the automatic acquisition of the hyponymy lexical relation from unrestricted text. Two goals motivate the approach: (i) avoidance of the need for pre-encoded knowledge and (ii) applicability across a wide range of text. We identify a set of lexico-syntactic patterns that are easily recognizable, that occur frequently and across text genre boundaries, and that indisputably indicate the lexical relation of interest. We describe a method for discovering these patterns and suggest that other lexical relations will also be acquirable in this way. A subset of the acquisition algorithm is implemented and the results are used to augment and critique the structure of a large hand-built thesaurus. Extensions and applications to areas such as information retrieval ar...
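
To make the idea of a lexico-syntactic pattern concrete, the following is a minimal sketch of matching one such pattern, "X such as A, B, and C", with a regular expression. It is an illustration only, not the paper's implementation: the names SUCH_AS and extract_hyponyms are hypothetical, and noun phrases are approximated as single words, an assumption made to keep the example short.

```python
import re

# Hypothetical, simplified pattern: "<hypernym> such as <item>, <item>, and <item>".
# ASSUMPTION: each noun phrase is a single word; a real system would need
# proper noun-phrase recognition.
SUCH_AS = re.compile(
    r"\b(?P<hyper>\w+),? such as "
    r"(?P<hypos>\w+(?:, (?!and |or )\w+)*(?:,? (?:and|or) \w+)?)"
)

def extract_hyponyms(sentence):
    """Return (hyponym, hypernym) pairs found by the simplified pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group("hyper")
        # Split the enumerated items on commas and on "and" / "or".
        items = re.split(r",? (?:and|or) |, ", match.group("hypos"))
        pairs.extend((item, hypernym) for item in items if item)
    return pairs

if __name__ == "__main__":
    text = "He studied injuries such as bruises, wounds, and fractures."
    for hyponym, hypernym in extract_hyponyms(text):
        print(f"hyponym({hyponym}, {hypernym})")
```

Each extracted pair reads as "the first term is a kind of the second". A single-word approximation will mislabel multi-word noun phrases, which is why the sketch should not be taken as a usable extractor.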