Paper: MMR-Based Feature Selection For Text Categorization

ACL ID: N04-4002
Title: MMR-Based Feature Selection For Text Categorization
Venue: Human Language Technologies
Session: Short Paper
Year: 2004
Authors

We introduce a new method of feature selection for text categorization. Our MMR-based feature selection method strives to reduce redundancy between features while maintaining information gain in selecting appropriate features for text categorization. Empirical results show that MMR-based feature selection is more effective than Koller & Sahami’s method, a greedy feature selection method, and than conventional information gain, which is commonly used in feature selection for text categorization. Moreover, MMR-based feature selection sometimes enables conventional machine learning algorithms to outperform SVM, which is known to give the best classification accuracy.
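
The abstract does not state the selection criterion itself, so the following is only a minimal sketch of the general idea, assuming the generic Maximal Marginal Relevance form: greedily pick the feature that maximizes `lam * relevance - (1 - lam) * redundancy`. Here relevance is the information gain of a term with respect to the class, and redundancy is taken to be the largest mutual information between the candidate and any already selected term. That redundancy measure, the weight `lam`, the function names, and the toy terms are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for two aligned sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def mmr_select(features, labels, k, lam=0.5):
    """Greedy MMR-style selection (illustrative, not the paper's exact formula).

    features: dict mapping a term to its per-document presence list (0/1).
    labels:   per-document class labels.
    lam:      assumed trade-off weight; lam=1.0 reduces to plain
              information-gain ranking.
    """
    # Information gain of a term w.r.t. the class equals I(term; class).
    relevance = {t: mutual_information(v, labels) for t, v in features.items()}
    selected = []
    candidates = set(features)
    while candidates and len(selected) < k:
        def score(term):
            # Assumed redundancy: strongest overlap with any selected term.
            redundancy = max(
                (mutual_information(features[term], features[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[term] - (1 - lam) * redundancy

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical toy corpus of 8 documents; "stocks" duplicates "stock".
labels = [1, 1, 1, 0, 0, 0, 0, 0]
features = {
    "stock":  [1, 1, 1, 1, 0, 0, 0, 0],
    "stocks": [1, 1, 1, 1, 0, 0, 0, 0],
    "rate":   [1, 1, 0, 0, 1, 1, 0, 0],
}

mmr_choice = mmr_select(features, labels, k=2, lam=0.5)
ig_choice = mmr_select(features, labels, k=2, lam=1.0)
```

On this toy data, ranking by information gain alone (`lam=1.0`) keeps both copies of the duplicated term, while the MMR-style score with `lam=0.5` replaces the duplicate with the weaker but non-redundant term. This is the redundancy reduction the abstract describes, under the assumptions stated above.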