Paper: Blog Categorization Exploiting Domain Dictionary and Dynamically Estimated Domains of Unknown Words

ACL ID P08-2018
Title Blog Categorization Exploiting Domain Dictionary and Dynamically Estimated Domains of Unknown Words
Venue Annual Meeting of the Association for Computational Linguistics
Session Main Conference
Year 2008
Authors

This paper presents an approach to text categorization that i) uses no machine learning and ii) reacts on-the-fly to unknown words. These features are important for categorizing Blog articles, which are updated on a daily basis and filled with newly coined words. We categorize 600 Blog articles into 12 domains. As a result, our categorization method achieved an accuracy of 94.0% (564/600).
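
To make the idea of categorization without machine learning concrete, here is a minimal sketch of one way a dictionary-driven categorizer could work. It is not the paper's algorithm: the abstract does not describe the scoring scheme or how unknown-word domains are estimated. The majority-vote rule, the toy `DOMAIN_DICT`, the domain labels, and the `estimate_unknown` callback are all assumptions introduced for illustration.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Optional

# Hypothetical toy dictionary (word -> domain); not the paper's actual resource.
DOMAIN_DICT: Dict[str, str] = {
    "goal": "SPORTS",
    "striker": "SPORTS",
    "recipe": "FOOD",
    "oven": "FOOD",
}


def categorize(
    words: Iterable[str],
    domain_dict: Dict[str, str],
    estimate_unknown: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Return the domain with the most word votes, or None if no word votes.

    Assumption: a simple majority vote over per-word domains. Words missing
    from the dictionary are passed to `estimate_unknown` (a hypothetical
    stand-in for on-the-fly domain estimation) when one is supplied.
    """
    votes: Counter = Counter()
    for word in words:
        domain = domain_dict.get(word)
        if domain is None and estimate_unknown is not None:
            domain = estimate_unknown(word)
        if domain is not None:
            votes[domain] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def accuracy_percent(correct: int, total: int) -> float:
    """Accuracy as a percentage, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)
```

With this sketch, `categorize(["goal", "striker", "oven"], DOMAIN_DICT)` picks `"SPORTS"` because two of the three words vote for it, and `accuracy_percent(564, 600)` reproduces the 94.0% figure reported in the abstract.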