Paper: A Bootstrapping Method For Learning Semantic Lexicons Using Extraction Pattern Contexts

ACL ID W02-1028
Title A Bootstrapping Method For Learning Semantic Lexicons Using Extraction Pattern Contexts
Venue Conference on Empirical Methods in Natural Language Processing
Session Main Conference
Year 2002
Authors

This paper describes a bootstrapping algorithm called Basilisk that learns high-quality semantic lexicons for multiple categories. Basilisk begins with an unannotated corpus and seed words for each semantic category, which are then bootstrapped to learn new words for each category. Basilisk hypothesizes the semantic class of a word based on collective information over a large body of extraction pattern contexts. We evaluate Basilisk on six semantic categories. The semantic lexicons produced by Basilisk have higher precision than those produced by previous techniques, with several categories showing substantial improvement.
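
The bootstrapping loop described in the abstract can be illustrated with a minimal sketch for a single category. This is not the paper's implementation: the function name `bootstrap_lexicon`, the `(pattern, word)` input format, the parameters `pool_size` and `words_per_iter`, the toy data, and both scoring formulas are assumptions chosen only to show the general idea of scoring a candidate word by the evidence collected over all of its extraction pattern contexts.

```python
from collections import defaultdict
from math import log2


def bootstrap_lexicon(extractions, seeds, iterations=2, pool_size=2, words_per_iter=1):
    """Grow one category's lexicon from seed words.

    extractions: iterable of (pattern, word) pairs (hypothetical input format).
    Returns the newly learned words in the order they were added.
    """
    pattern_words = defaultdict(set)
    word_patterns = defaultdict(set)
    for pattern, word in extractions:
        pattern_words[pattern].add(word)
        word_patterns[word].add(pattern)

    lexicon = set(seeds)
    learned = []
    for _ in range(iterations):
        # Number of current lexicon members each pattern extracts.
        hits = {p: len(ws & lexicon) for p, ws in pattern_words.items()}

        # Illustrative pattern score (an assumption): favor patterns that are
        # both precise and frequent extractors of known category words.
        def pattern_score(p):
            return hits[p] / len(pattern_words[p]) * log2(hits[p] + 1)

        ranked = sorted(pattern_words, key=lambda p: (-pattern_score(p), p))
        pool = [p for p in ranked[:pool_size] if hits[p] > 0]

        # Candidates are unlabeled words extracted by the pooled patterns.
        candidates = {w for p in pool for w in pattern_words[p]} - lexicon
        if not candidates:
            break

        # Illustrative word score (an assumption): average the evidence over
        # every pattern context the word appears in, not just one pattern.
        def word_score(w):
            ps = word_patterns[w]
            return sum(log2(hits[p] + 1) for p in ps) / len(ps)

        best = sorted(candidates, key=lambda w: (-word_score(w), w))[:words_per_iter]
        lexicon.update(best)
        learned.extend(best)
    return learned


# Toy (pattern, word) extractions, invented for illustration only.
EXTRACTIONS = [
    ("attack on <x>", "embassy"),
    ("attack on <x>", "village"),
    ("attack on <x>", "church"),
    ("traveled to <x>", "village"),
    ("traveled to <x>", "church"),
    ("traveled to <x>", "city"),
    ("said <x>", "spokesman"),
    ("said <x>", "city"),
]

print(bootstrap_lexicon(EXTRACTIONS, {"village", "city"}))  # ['church', 'embassy']
```

In this toy run, "church" is added first because it appears in two contexts that also extract seed words, and once it joins the lexicon the pattern "attack on <x>" gains enough support to contribute "embassy". The word "spokesman" is never added, since its only context offers weak evidence. Handling multiple categories at once, as the abstract describes, would require running such a loop per category and is left out of this sketch.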