Paper: Evaluation Of Automatically Identified Index Terms For Browsing Electronic Documents

ACL ID A00-1042
Title Evaluation Of Automatically Identified Index Terms For Browsing Electronic Documents
Venue Annual Conference of the North American Chapter of the Association for Computational Linguistics
Session Main Conference
Year 2000
Authors

We present an evaluation of domain-independent natural language tools for use in the identification of significant concepts in documents. Using qualitative evaluation, we compare three shallow processing methods for extracting index terms, i.e., terms that can be used to model the content of documents. We focus on two criteria: quality and coverage. In terms of quality alone, our results show that technical term (TT) extraction [Justeson and Katz 1995] receives the highest rating. However, in terms of a combined quality and coverage metric, the Head Sorting (HS) method, described in [Wacholder 1998], outperforms the other two methods, keyword (KW) and TT.