Paper: A Study On Automatically Extracted Keywords In Text Categorization

ACL ID P06-1068
Title A Study On Automatically Extracted Keywords In Text Categorization
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2006
Authors

This paper presents a study of whether and how automatically extracted keywords can be used to improve text categorization. In summary, we show that higher performance, as measured by micro-averaged F-measure on a standard text categorization collection, is achieved when the full-text representation is combined with the automatically extracted keywords. The combination is obtained by giving higher weights to words in the full texts that are also extracted as keywords. We also present results for experiments in which the keywords are the only input to the categorizer, either represented as unigrams or intact. Of these two experiments, the unigrams have the better performance, although neither performs as well as headlines only.
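The combination described in the abstract, and the two keyword-only representations, can be illustrated with a minimal sketch. The abstract does not give the weighting scheme, so the multiplicative `boost` factor, the use of raw term counts, and the function names `weighted_term_counts` and `keyword_features` are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def weighted_term_counts(text_tokens, keywords, boost=2.0):
    """Count terms in a document, up-weighting tokens that occur in a keyword.

    `boost` is a hypothetical multiplier; the abstract only says that such
    words receive "higher weights", not how much higher.
    """
    # Split multi-word keywords so that each of their tokens can be matched.
    keyword_tokens = {tok for kw in keywords for tok in kw.lower().split()}
    counts = Counter(tok.lower() for tok in text_tokens)
    return {
        tok: count * (boost if tok in keyword_tokens else 1.0)
        for tok, count in counts.items()
    }


def keyword_features(keywords, as_unigrams=True):
    """Represent keywords either as unigrams or as intact phrases."""
    if as_unigrams:
        return sorted({tok for kw in keywords for tok in kw.lower().split()})
    return sorted({kw.lower() for kw in keywords})


tokens = "the bank raised interest rates as interest grew".split()
weights = weighted_term_counts(tokens, ["interest rates"], boost=2.0)
unigram_view = keyword_features(["interest rates", "bank"], as_unigrams=True)
intact_view = keyword_features(["interest rates", "bank"], as_unigrams=False)
```

In this sketch the unigram view splits "interest rates" into two features, while the intact view keeps the phrase as a single feature; these correspond to the two keyword-only settings the abstract compares.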